• DocumentCode
    2905061
  • Title

    Automatic Deep Web Query Results Extraction Based on Tag Trees

  • Author

    Xie, Ying ; Zuo, Wanli ; He, Fengling ; Wang, Ying

  • Author_Institution
    Coll. of Comput. Sci. & Technol., Jilin Univ., Changchun, China
  • Volume
    2
  • fYear
    2009
  • fDate
    12-14 Dec. 2009
  • Firstpage
    308
  • Lastpage
    311
  • Abstract
    Automatic deep Web query results extraction is a key step in processing deep Web query results. Extracting the query results correctly is a precondition for semantic annotation and data integration. This paper proposes a simple method, based on tag trees and on the features of deep Web query result pages, for extracting deep Web query results automatically. The method first builds a tag tree of the given result page. It then finds minimal data regions in the tag tree from the top down and extracts the data records they contain. Experiments show that the method is effective.
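    The three steps named in the abstract (build a tag tree, locate data regions from the top down, extract the records inside them) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not define how a minimal data region is detected, so the sketch assumes a region is the first node met on a top-down walk that has at least `min_repeats` children with the same tag and the same child-tag layout. All names here (`Node`, `TagTreeBuilder`, `signature`, `find_data_regions`, `extract_records`, `min_repeats`) are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so the builder must not descend into them.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    """One element of the tag tree."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""  # text found directly inside this element


class TagTreeBuilder(HTMLParser):
    """Step 1: build a tag tree from the result page."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data


def build_tag_tree(html):
    builder = TagTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def signature(node):
    # Assumed similarity measure: same tag and same sequence of child tags.
    return (node.tag, tuple(child.tag for child in node.children))


def similar_children(node, min_repeats):
    """Return the largest group of structurally identical children, if big enough."""
    counts = Counter(signature(child) for child in node.children)
    if not counts:
        return []
    best, count = counts.most_common(1)[0]
    if count < min_repeats:
        return []
    return [child for child in node.children if signature(child) == best]


def find_data_regions(node, min_repeats=3):
    """Step 2: walk top-down and stop at the first qualifying node on each path."""
    if similar_children(node, min_repeats):
        return [node]
    regions = []
    for child in node.children:
        regions.extend(find_data_regions(child, min_repeats))
    return regions


def node_text(node):
    # Direct text first, then descendants; whitespace is collapsed.
    parts = [node.text] + [node_text(child) for child in node.children]
    return " ".join(" ".join(parts).split())


def extract_records(html, min_repeats=3):
    """Step 3: return one list of record strings per detected data region."""
    root = build_tag_tree(html)
    return [
        [node_text(child) for child in similar_children(region, min_repeats)]
        for region in find_data_regions(root, min_repeats)
    ]
```

    Calling `extract_records(page_html)` on a result page returns one list of record strings per detected region. The repetition heuristic will also match navigation menus and similar repeated markup, so a real extractor would need a stronger similarity test than this sketch uses.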
  • Keywords
    Internet; query processing; trees (mathematics); automatic deep Web query results extraction; data integration; semantic annotation; tag trees; Computational intelligence; Computer science; Computer science education; Data mining; Educational institutions; Educational technology; HTML; Knowledge engineering; Laboratories; Web pages; data records; minimal data regions; query results
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Design, 2009. ISCID '09. Second International Symposium on
  • Conference_Location
    Changsha
  • Print_ISBN
    978-0-7695-3865-5
  • Type

    conf

  • DOI
    10.1109/ISCID.2009.223
  • Filename
    5368727