Title :
Summarizing ontology-based schemas in PDMS
Author :
Pires, Carlos Eduardo ; Sousa, Paulo ; Kedad, Zoubida ; Salgado, Ana Carolina
Author_Institution :
Comput. Sci. Dept., Fed. Univ. of Campina Grande (UFCG), Campina Grande, Brazil
Abstract :
Quickly understanding the content of a data source is useful in several contexts. In a Peer Data Management System (PDMS), peers can be semantically clustered, with each cluster represented by a schema obtained by merging the local schemas of its peers. In this paper, we present a process for summarizing the schemas of peers participating in a PDMS. We assume that all schemas are represented by ontologies, and we propose a summarization algorithm that produces a summary containing the maximum number of relevant concepts and the minimum number of non-relevant concepts of the initial ontology. The relevance of a concept is determined using the notions of centrality and frequency. Since several candidate summaries can be identified during the summarization process, classical Information Retrieval metrics are employed to determine the best summary.
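The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the relevance formula, the centrality and frequency definitions, or the exact metrics used, so the linear weighting, the 0.5 threshold, the choice of F-measure, the concept names, and the function names `relevance`, `f_measure`, and `best_summary` are all assumptions made for illustration.

```python
def relevance(centrality, frequency, weight=0.5):
    """Hypothetical relevance score: a weighted mix of centrality and frequency.

    The paper's actual formula is not stated in the abstract; this linear
    combination is only an assumption.
    """
    return {
        concept: weight * centrality[concept]
        + (1 - weight) * frequency.get(concept, 0.0)
        for concept in centrality
    }


def f_measure(candidate, relevant):
    """Return (precision, recall, F-measure) of a candidate summary."""
    candidate, relevant = set(candidate), set(relevant)
    hits = len(candidate & relevant)
    precision = hits / len(candidate) if candidate else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def best_summary(candidates, relevant):
    """Pick the candidate summary with the highest F-measure."""
    return max(candidates, key=lambda s: f_measure(s, relevant)[2])


# Made-up scores for a toy ontology (illustrative values only).
centrality = {"Person": 0.9, "Student": 0.6, "Address": 0.2, "ZipCode": 0.1}
frequency = {"Person": 1.0, "Student": 0.8, "Address": 0.4, "ZipCode": 0.1}

scores = relevance(centrality, frequency)
# Assumed cut-off: concepts scoring at least 0.5 count as relevant.
relevant = {c for c, s in scores.items() if s >= 0.5}

candidates = [
    {"Person", "Address"},
    {"Person", "Student", "Address"},
    {"Person"},
]
chosen = best_summary(candidates, relevant)
```

Precision penalizes non-relevant concepts kept in a summary and recall penalizes relevant concepts left out, so a combined measure such as the F-measure is one plausible way to balance the two goals the abstract names.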
Keywords :
database management systems; information retrieval; ontologies (artificial intelligence); centrality notion; frequency notion; information retrieval metrics; ontology-based schemas; peer data management system; summarization algorithm; Clustering algorithms; Computer science; Content management; Databases; Frequency; Informatics; Information retrieval; Large scale integration; Merging; Ontologies
Conference_Title :
Data Engineering Workshops (ICDEW), 2010 IEEE 26th International Conference on
Conference_Location :
Long Beach, CA
Print_ISBN :
978-1-4244-6522-4
Electronic_ISBN :
978-1-4244-6521-7
DOI :
10.1109/ICDEW.2010.5452706