• DocumentCode
    2015348
  • Title
    Visualization and Integration of Databases Using Self-Organizing Map
  • Author
    Bourennani, Farid ; Pu, Ken Q. ; Zhu, Ying
  • Author_Institution
    Univ. of Ontario Inst. of Technol., Oshawa, ON
  • fYear
    2009
  • fDate
    1-6 March 2009
  • Firstpage
    155
  • Lastpage
    160
  • Abstract
    As computer networks grow, accessible data is becoming increasingly distributed. Understanding and integrating remote and unfamiliar data sources are important data management issues. In this paper, we propose using self-organizing map (SOM) clustering to aid the visualization of similar columns and the integration of relational database tables and attributes based on their content. To accommodate the heterogeneous data types found in relational databases, we extend the TFIDF measure to handle numerical attribute types, in addition to text, for coincident meaning extraction. We present a SOM clustering based visualization algorithm that allows the user to browse heterogeneously typed database attributes and discover semantically similar clusters. Additionally, we propose a new algorithm, Common Item Based Classifier (CIBC), to improve the homogeneity of the clusters obtained by SOM. The discovered semantic clusters can significantly aid the manual or automated construction of data integrity constraints in data cleaning or of schema mappings in data integration.
  • Keywords
    data integrity; data mining; data visualisation; distributed databases; pattern classification; pattern clustering; query processing; relational databases; self-organising feature maps; common item based classifier algorithm; data integrity constraint; data visualization algorithm; distributed database browsing; heterogeneous data type; numerical data mining; relational database table; self-organizing map clustering; Application software; Clustering algorithms; Computer network management; Data mining; Data visualization; Distributed databases; Information retrieval; Relational databases; Self organizing feature maps; Visual databases; Common Item Based Classifier (CIBC); Data Integration; Information Retrieval (IR); SOM;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advances in Databases, Knowledge, and Data Applications, 2009. DBKDA '09. First International Conference on
  • Conference_Location
    Gosier
  • Print_ISBN
    978-1-4244-3467-1
  • Electronic_ISBN
    978-0-7695-3550-0
  • Type
    conf
  • DOI
    10.1109/DBKDA.2009.30
  • Filename
    5071828