Title :
Automatic extraction of Web search interface based on visual features
Author :
Zhang, Yu-lian ; Qiao, Jing-yang
Author_Institution :
Inf. Sci. & Eng. Coll., Yanshan Univ., Qinhuangdao
Abstract :
Ordinarily, a Web query interface can be regarded as an interface schema containing multiple attributes and rich semantic/meta information; however, the schema is not formally defined in HTML. We observe that most Web pages carry many visual cues that help distinguish different parts of the page. In this paper, we propose a novel approach to the query interface extraction problem. First, we propose a schema model for representing complex search interfaces. Second, we present an approach based on visual features that automatically extracts search interfaces from Web pages. It simulates how a user understands a Web search interface through visual perception. Our experimental results indicate that the visual-feature approach works significantly better than the baselines in search interface extraction and achieves very high extraction accuracy.
Keywords :
Internet; hypermedia markup languages; user interfaces; HTML; Web page; Web query interface; Web search interface extraction; interface schema; metainformation; search interface representation; semantic information; visual cues; visual features; visual perception; Books; Data mining; Feature extraction; Neodymium; Noise level; Spatial databases; Visual databases; Web pages; Web search; Search Interfaces Extraction; Visual Feature; Web Database
Conference_Title :
Industrial Electronics and Applications, 2008. ICIEA 2008. 3rd IEEE Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-1717-9
Electronic_ISBN :
978-1-4244-1718-6
DOI :
10.1109/ICIEA.2008.4582925