DocumentCode :
2898732
Title :
Discovering Complex Semantic Matches between Database Schemas
Author :
Qian, Ying ; Li, Yuxiang ; Song, Jinling ; Yue, Liwen
Author_Institution :
Network & Modern Educ. Technol. Center, Normal Univ. of Sci. & Technol., Qin Huangdao, China
fYear :
2009
fDate :
7-8 Nov. 2009
Firstpage :
756
Lastpage :
760
Abstract :
Schema matching, the problem of finding semantic correspondences between elements of two schemas, plays a key role in many applications, such as data warehousing, integration of heterogeneous data sources, and the Semantic Web. Existing approaches to automating schema matching focus almost exclusively on computing direct element matches (1:1 matches) between two schemas. However, relationships between real-world schemas involve many complex matches in addition to 1:1 matches. At present, few methods, such as iMAP, can discover complex matches, and they match inefficiently because the space of candidate matches they must search is very large. This paper introduces a complex schema matching system called CSM. First, its preprocessor and clustering processor filter out unreasonable matches based on data types and values, and its match generator employs a set of special-purpose searchers, each exploring a specialized portion of the search space, to discover 1:1 and complex matches. Then a similarity estimator evaluates the candidate matches and a match selector selects the optimal ones. Finally, to handle opaque columns in the schemas being matched, a complementary matcher further discovers matching relations between those columns. CSM can thereby discover more general and reasonable matching pairs. Experiments show that CSM not only discovers matches between schemas comprehensively but also improves matching recall and precision in practice.
Keywords :
data mining; database management systems; pattern matching; complementary matcher; complex schema matching system; complex semantic matches; data warehouse; database schemas; heterogeneous data sources integration; match selector; opaque columns; optimal candidate matches; searching; semantic Web; similarity estimator; special-purpose searchers; Computer science education; Costs; Data warehouses; Educational institutions; Educational technology; Euclidean distance; Information systems; Matched filters; Relational databases; Semantic Web; clustering; complex matching; schema matching;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Information Systems and Mining, 2009. WISM 2009. International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-0-7695-3817-4
Type :
conf
DOI :
10.1109/WISM.2009.156
Filename :
5368370