DocumentCode :
2698142
Title :
Web Page Element Classification Based on Visual Features
Author :
Burget, Radek ; Rudolfová, Ivana
Author_Institution :
Fac. of Inf. Technol., Brno Univ. of Technol., Brno, Czech Republic
fYear :
2009
fDate :
1-3 April 2009
Firstpage :
67
Lastpage :
72
Abstract :
When traditional data mining methods are applied to World Wide Web documents, a typical problem is that an ordinary Web page contains many kinds of information in addition to its main content. This additional information, such as navigation, advertisements, or copyright notices, negatively influences the results of data mining methods such as content classification. In this paper, we present a method for detecting interesting areas in a Web page. The method is inspired by the approach a human reader is assumed to take to this task. First, basic visual blocks are detected in the page; subsequently, the purpose of each block is guessed from its visual appearance. We describe the page segmentation method used for visual block detection, propose a way of classifying the blocks based on their visual features, and finally provide an experimental evaluation of the method on real-world data.
Keywords :
Internet; classification; data mining; image segmentation; Web page element classification; Web page interesting area detection; Web page segmentation; World Wide Web documents; content classification; visual block detection; Database systems; Deductive databases; HTML; Humans; Information retrieval; Information technology; Navigation; Web pages; Web sites; page segmentation; preprocessing; visual blocks; visual features
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Information and Database Systems, 2009. ACIIDS 2009. First Asian Conference on
Conference_Location :
Dong Hoi
Print_ISBN :
978-0-7695-3580-7
Type :
conf
DOI :
10.1109/ACIIDS.2009.71
Filename :
5175969