• DocumentCode
    495468
  • Title
    A Method to Automatically Discover and Classify Deep Web Data Source Using Multi-Classifier
  • Author
    Zhi-tao, Li ; Quan, Liu ; Zhi-ming, Cui ; Yu-chen, Fu
  • Author_Institution
    Inst. of Comput. Sci. & Technol., Soochow Univ., Suzhou, China
  • Volume
    3
  • fYear
    2009
  • fDate
    March 31 2009-April 2 2009
  • Firstpage
    736
  • Lastpage
    740
  • Abstract
    Recently, the discovery of deep Web data sources and the related issue of domain relevance have attracted increasing attention. This paper proposes a method that uses multiple classifiers to discover and classify deep Web data sources. First, a naive Bayes classifier is used to classify each page as domain-relevant or not. Second, an improved C4.5 decision tree algorithm is used to identify query interfaces. Experimental results, compared against a single decision tree classifier, show that the method is effective.
  • Keywords
    Bayes methods; Internet; decision trees; pattern classification; query processing; data classification; data discovery; data multiclassifier; data source; decision tree; deep Web; naive Bayes classifier; query interface; Classification tree analysis; Computer science; Crawlers; Data engineering; Data mining; Databases; Decision trees; Frequency; HTML; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Information Engineering, 2009 WRI World Congress on
  • Conference_Location
    Los Angeles, CA
  • Print_ISBN
    978-0-7695-3507-4
  • Type
    conf
  • DOI
    10.1109/CSIE.2009.435
  • Filename
    5170939