DocumentCode
2015348
Title
Visualization and Integration of Databases Using Self-Organizing Map
Author
Bourennani, Farid ; Pu, Ken Q. ; Zhu, Ying
Author_Institution
Univ. of Ontario Inst. of Technol., Oshawa, ON
fYear
2009
fDate
1-6 March 2009
Firstpage
155
Lastpage
160
Abstract
With the growth of computer networks, accessible data is becoming increasingly distributed. Understanding and integrating remote and unfamiliar data sources are important data management issues. In this paper, we propose using self-organizing map (SOM) clustering to aid the visualization of similar columns and the integration of relational database tables and attributes based on their content. To accommodate the heterogeneous data types found in relational databases, we extend the TFIDF measure to handle numerical attribute types, in addition to text, for coincident meaning extraction. We present a SOM-clustering-based visualization algorithm that allows the user to browse heterogeneously typed database attributes and discover semantically similar clusters. Additionally, we propose a new algorithm, Common Item Based Classifier (CIBC), to improve the homogeneity of the clusters obtained by SOM. The discovered semantic clusters can significantly aid the manual or automated construction of data integrity constraints in data cleaning, or of schema mappings in data integration.
Keywords
data integrity; data mining; data visualisation; distributed databases; pattern classification; pattern clustering; query processing; relational databases; self-organising feature maps; common item based classifier algorithm; data integrity constraint; data visualization algorithm; distributed database browsing; heterogeneous data type; numerical data mining; relational database table; self-organizing map clustering; Application software; Clustering algorithms; Computer network management; Data mining; Data visualization; Distributed databases; Information retrieval; Relational databases; Self organizing feature maps; Visual databases; Common Item Based Classifier (CIBC); Data Integration; Information Retrieval (IR); SOM;
fLanguage
English
Publisher
IEEE
Conference_Titel
Advances in Databases, Knowledge, and Data Applications, 2009. DBKDA '09. First International Conference on
Conference_Location
Gosier
Print_ISBN
978-1-4244-3467-1
Electronic_ISBN
978-0-7695-3550-0
Type
conf
DOI
10.1109/DBKDA.2009.30
Filename
5071828