DocumentCode
495468
Title
A Method to Automatically Discover and Classify Deep Web Data Source Using Multi-Classifier
Author
Zhi-tao, Li ; Quan, Liu ; Zhi-ming, Cui ; Yu-chen, Fu
Author_Institution
Inst. of Comput. Sci. & Technol., Soochow Univ., Suzhou, China
Volume
3
fYear
2009
fDate
March 31 2009-April 2 2009
Firstpage
736
Lastpage
740
Abstract
Recently, the discovery of deep Web data sources and the related problem of domain relevance have attracted increasing attention. This paper proposes a multi-classifier method to discover and classify deep Web data sources. First, a naive Bayes classifier classifies each page as domain-relevant or not. Second, an improved C4.5 decision tree algorithm identifies the query interface. An experimental comparison with a single decision tree classifier shows that the method is effective.
Keywords
Bayes methods; Internet; decision trees; pattern classification; query processing; data classification; data discovery; data multiclassifier; data source; decision tree; deep Web; naive Bayes classifier; query interface; Classification tree analysis; Computer science; Crawlers; Data engineering; Data mining; Databases; Decision trees; Frequency; HTML; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science and Information Engineering, 2009 WRI World Congress on
Conference_Location
Los Angeles, CA
Print_ISBN
978-0-7695-3507-4
Type
conf
DOI
10.1109/CSIE.2009.435
Filename
5170939