Title :
Automatic structured Web databases classification
Author :
Cui, XiaoJun ; Ren, ZhongSheng ; Xiao, HongYu ; Le Xu
Author_Institution :
Wenzhou Vocational Coll. of Sci. & Technol., Wenzhou, China
Abstract :
The growing number of structured Web databases poses enormous challenges for large-scale Deep Web data integration. Organizing such structured Web databases into a hierarchical directory tree is a critical step towards large-scale integration of the Deep Web. This paper presents a method for the automatic classification of Web databases. First, a method is proposed for calculating the semantic similarities among Web databases based on their interface schemas, and this calculation is formulated as an extended optimal matching problem on a bipartite graph. Then, based on the resulting similarity matrix, an agglomerative hierarchical clustering algorithm is proposed that automatically organizes the Web databases into a hierarchy tree. Theoretical analysis and experimental results show that the method is efficient.
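The two-stage pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the "extended" matching or the semantic measure, so the sketch assumes a word-overlap (Jaccard) score between attribute labels, a brute-force optimal one-to-one matching suitable only for small schemas, and average-linkage clustering. All function names and the sample schemas are hypothetical.

```python
from itertools import permutations


def label_sim(a, b):
    """Jaccard overlap of lowercase word sets (assumed stand-in for a semantic measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def schema_sim(s1, s2):
    """Weight of the best one-to-one attribute matching, normalised by the larger schema."""
    small, large = sorted((s1, s2), key=len)
    if not small:
        return 0.0
    # Brute-force optimal bipartite matching: try every assignment of the
    # smaller schema's attributes to distinct attributes of the larger one.
    best = max(
        sum(label_sim(a, b) for a, b in zip(small, perm))
        for perm in permutations(large, len(small))
    )
    return best / len(large)


def cluster(schemas):
    """Average-linkage agglomerative clustering; returns a nested-tuple hierarchy."""
    names = list(schemas)
    # Pairwise similarity matrix over all databases.
    sim = {(a, b): schema_sim(schemas[a], schemas[b]) for a in names for b in names}
    # Each cluster is a pair: (subtree, list of member database names).
    clusters = [(n, [n]) for n in names]

    def linkage(c1, c2):
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(sim[p] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two most similar clusters.
        i, j = max(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical query-interface schemas for three Web databases.
schemas = {
    "books_a": ["title", "author", "isbn"],
    "books_b": ["book title", "author name", "publisher"],
    "flights": ["departure city", "arrival city", "departure date"],
}
tree = cluster(schemas)
```

In this toy input the two book databases share attribute words and are merged first, and the flight database joins them only at the root. For realistic schema sizes the brute-force matching would need to be replaced by a polynomial-time assignment algorithm such as the Hungarian method.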
Keywords :
Web sites; database management systems; pattern classification; trees (mathematics); Web data integration; agglomerative hierarchical clustering algorithm; automatic structured Web databases classification; bipartite graph; extended optimal matching; hierarchy directory tree; interface scheme; similarity matrix; Artificial neural networks; Databases; Motion pictures; Nickel; bipartite graph matching; hierarchical clustering; interface schema; web databases;
Conference_Title :
Intelligent Computing and Intelligent Systems (ICIS), 2010 IEEE International Conference on
Conference_Location :
Xiamen
Print_ISBN :
978-1-4244-6582-8
DOI :
10.1109/ICICISYS.2010.5658701