• DocumentCode
    2348332
  • Title
    Automatic extraction of Web search interface based on visual features
  • Author
    Zhang, Yu-lian ; Qiao, Jing-yang
  • Author_Institution
    Inf. Sci. & Eng. Coll., Yanshan Univ., Qinhuangdao
  • fYear
    2008
  • fDate
    3-5 June 2008
  • Firstpage
    2288
  • Lastpage
    2291
  • Abstract
    Ordinarily, a Web query interface can be regarded as an interface schema containing multiple attributes and rich semantic/meta information; however, this schema is not formally defined in HTML. We observe that most Web pages carry many visual cues that help distinguish the different parts of the page. In this paper, we propose a novel approach to the query interface extraction problem. First, we propose a schema model for representing complex search interfaces. Second, we present an approach based on visual features that automatically extracts search interfaces from Web pages. It simulates how a user understands a Web search interface through visual perception. Our experimental results indicate that the visual feature approach works significantly better than the baselines in search interface extraction and achieves very high extraction accuracy.
  • Keywords
    Internet; hypermedia markup languages; user interfaces; HTML; Web page; Web query interface; Web search interface extraction; interface schema; metainformation; search interface representation; semantic information; visual cues; visual features; visual perception; Books; Data mining; Feature extraction; HTML; Neodymium; Noise level; Spatial databases; Visual databases; Web pages; Web search; Search Interfaces Extraction; Visual Feature; Web Database;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Industrial Electronics and Applications, 2008. ICIEA 2008. 3rd IEEE Conference on
  • Conference_Location
    Singapore
  • Print_ISBN
    978-1-4244-1717-9
  • Electronic_ISBN
    978-1-4244-1718-6
  • Type
    conf
  • DOI
    10.1109/ICIEA.2008.4582925
  • Filename
    4582925