Title :
An effective keyword extraction method for videos in web pages by analyzing their layout structures
Author :
Lee, Jongwon ; Choi, Giseok ; Jang, Juyeon ; Nang, Jongho
Author_Institution :
Jongwon Lee Chungkang Coll., Gyunggi-do
Date :
Oct. 30 2007-Nov. 2 2007
Abstract :
This paper proposes an effective keyword extraction method for Web videos that analyzes the structure of the Web pages containing them. The proposed scheme calculates the relative importance (or weight) of each text block to a video by analyzing the distance from the text block to the video. This distance, called the layout distance, indicates the degree of relevance of a text block to a video and can be estimated by analyzing the layout structure of the Web page. Since Web pages with several videos, such as pages posting UCC videos, have a special layout structure, this layout analysis helps to estimate precisely the relevance of a text block to a video. The weight of a text block is then used to compute the final weights of the keywords extracted from that block, together with an analysis of their HTML tags and other well-known techniques such as TF/IDF. Experiments with 1,087 Web pages containing a total of 2,462 videos show that the precision of the proposed extraction scheme is 17% higher than that of ImageRover.
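The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the decay function in `block_weight`, the values in `TAG_WEIGHTS`, the smoothed IDF, and the product used to combine the three factors are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical tag weights; the abstract does not state the values used.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "b": 1.5}


def block_weight(layout_distance):
    """Assumed decay: text blocks closer to the video get a larger weight."""
    return 1.0 / (1.0 + layout_distance)


def score_keywords(blocks, doc_freq, num_docs):
    """Rank candidate keywords for one video.

    blocks:   list of (layout_distance, html_tag, list_of_terms)
    doc_freq: dict mapping a term to the number of documents containing it
    num_docs: total number of documents in the collection
    """
    scores = Counter()
    for distance, tag, terms in blocks:
        if not terms:
            continue  # an empty block contributes nothing
        w_block = block_weight(distance)
        w_tag = TAG_WEIGHTS.get(tag, 1.0)  # unlisted tags get a neutral weight
        for term, count in Counter(terms).items():
            tf = count / len(terms)
            # Smoothed IDF (an assumption) so that unseen terms do not divide by zero.
            idf = math.log((1 + num_docs) / (1 + doc_freq.get(term, 0))) + 1.0
            scores[term] += w_block * w_tag * tf * idf
    return scores.most_common()
```

With this combination, a term that appears in a heavily weighted tag close to the video outranks the same term in a distant plain-text block, which is the behaviour the abstract describes qualitatively.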
Keywords :
Web design; hypermedia markup languages; portals; video retrieval; HTML tags; ImageRover; UCC videos; Web pages; Web videos; keyword extraction method; layout distance; layout structure; Clustering algorithms; Data mining; Educational institutions; HTML; Image analysis; Ontologies; User-generated content; Videos; Web pages; World Wide Web;
Conference_Title :
TENCON 2007 - 2007 IEEE Region 10 Conference
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-1272-3
Electronic_ISBN :
978-1-4244-1272-3
DOI :
10.1109/TENCON.2007.4428794