Title :
A Machine Learning Approach Classification of Deep Web Sources
Author :
Xu, Hexiang ; Zhang, Chenghong ; Hao, Xiulan ; Hu, Yunfa
Author_Institution :
Fudan Univ., Shanghai
Abstract :
The classification of deep Web sources is an important part of large-scale deep Web integration, a field that is still at an early stage. Many deep Web sources are structured: they provide structured query interfaces and return structured results. Classifying such sources into domains is one of the critical steps toward integrating heterogeneous Web sources. To date, existing work on classification has focused mainly on texts or Web documents, and little of it addresses the deep Web. In this paper, we present a deep Web model and a machine-learning-based classification model. The experimental results show that good performance can be achieved with a small number of training samples for each domain, and that performance remains stable as the number of training samples increases.
Keywords :
Internet; classification; learning (artificial intelligence); deep Web sources classification; machine learning; Databases; Information technology; Large scale integration; Oceans; Sea surface; Search engines; Technology management; Web pages
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
Conference_Location :
Haikou
Print_ISBN :
978-0-7695-2874-8
DOI :
10.1109/FSKD.2007.54