Title :
Complex Synonymous Matchings Based on Correlation Mining
Author :
Jie, Liu ; Nianbin, Wang ; Fujiang, Liu ; Yangyao, Zhao
Author_Institution :
Coll. of Comput. Sci. & Technol., Harbin Eng. Univ., Harbin, China
Abstract :
In recent years, with a virtually unlimited number of information sources, the deep Web has become an important frontier for data integration. Schema matching is fundamental to supporting query mediation across deep Web sources. To integrate millions of heterogeneous information sources and complete synonymous matching among their query interfaces, this paper develops an automatic synonymous matching framework consisting of three stages: XML document data preparation, synonymous matching discovery, and matching selection. The framework combines correlation mining with schema matching and can mine complex m:n matchings in addition to simple 1:1 matchings. By observing query interfaces on the deep Web, we found that matched attributes or attribute groups, which express the same semantic information, rarely co-occur in the same query interface. This insight enables us to discover complex synonymous matchings using a correlation mining approach. Moreover, we propose a new correlation algorithm to ensure the accuracy of synonymous matching.
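The co-occurrence insight above can be illustrated with a minimal sketch. The abstract does not specify the paper's correlation measure, so the score below (one minus the co-occurrence count divided by the smaller of the two occurrence counts) is a stand-in chosen only for illustration, and the toy interfaces and the names `occurs` and `negative_correlation` are hypothetical.

```python
# Hypothetical toy data: each set holds the attribute labels of one query interface.
interfaces = [
    {"author", "title", "isbn"},
    {"first name", "last name", "title"},
    {"author", "title", "subject"},
    {"first name", "last name", "isbn"},
    {"author", "subject"},
]


def occurs(group, interface):
    """A group of attributes occurs in an interface if all its members appear there."""
    return set(group) <= interface


def negative_correlation(group_a, group_b, interfaces):
    """Illustrative score in [0, 1]; 1.0 means the two groups never co-occur.

    This is an assumed stand-in measure, not the algorithm proposed in the paper.
    """
    count_a = sum(occurs(group_a, s) for s in interfaces)
    count_b = sum(occurs(group_b, s) for s in interfaces)
    both = sum(occurs(group_a, s) and occurs(group_b, s) for s in interfaces)
    if min(count_a, count_b) == 0:
        return 0.0  # no evidence for a group that never occurs
    return 1.0 - both / min(count_a, count_b)


# A 1:2 candidate: "author" versus the group {"first name", "last name"}.
score_synonym = negative_correlation({"author"}, {"first name", "last name"}, interfaces)

# Attributes that always appear together belong to one group, not to a synonym pair.
score_grouped = negative_correlation({"first name"}, {"last name"}, interfaces)
```

In this toy data, `author` never shares an interface with the `first name` / `last name` group, so the pair scores as a strong synonym candidate, while `first name` and `last name` always co-occur and score zero. A measure this simple also flags unrelated attributes that merely happen not to co-occur, which is presumably why a separate matching selection stage and a more careful correlation measure are needed.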
Keywords :
Internet; XML; data mining; pattern matching; query processing; user interfaces; XML document data preparation; complex synonymous matching discovery; correlation mining; data integration; deep Web query interface; heterogeneous information source; query mediation; schema matching; Application software; Books; Computer science; Data engineering; Educational institutions; Information resources; Mediation; Pediatrics; correlation measure; deep web
Conference_Title :
Interoperability for Enterprise Software and Applications China, 2009. IESA '09. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-0-7695-3652-1
DOI :
10.1109/I-ESA.2009.21