Title :
Automatic Generation of Integration and Preprocessing Ontologies for Biomedical Sources in a Distributed Scenario
Author :
Anguita, Alberto ; Perez-Rey, David ; Crespo, José ; Maojo, Víctor
Author_Institution :
Biomed. Inf. Group, Univ. Politec. de Madrid, Madrid
Abstract :
Access to a large number of remote data sources has boosted research in biomedicine, where many biological and clinical research projects rely on collaborative efforts among international organizations. In this scenario, the authors have developed various methods and tools for database integration using an ontological approach. This paper describes a method to automatically generate preprocessing structures (ontologies) within an ontology-based knowledge discovery in databases (KDD) model. These ontologies are obtained by analyzing the data sources and searching for: (i) valid numerical ranges (using clustering techniques), (ii) different scales, (iii) synonym transformations based on known dictionaries and (iv) typographical errors. To test the method, experiments were carried out with four biomedical databases (covering rheumatoid arthritis, gene expression patterns, biological processes and breast cancer patients), which demonstrated the performance of the approach. The method supports experts in data analysis processes, facilitating the detection of inconsistencies.
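Two of the checks listed in the abstract, valid numerical ranges and typographical errors, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, thresholds and sample values are hypothetical, a simple gap-based grouping stands in for the unspecified clustering technique, and string similarity from Python's standard `difflib` stands in for whatever typo detection the paper uses.

```python
from collections import Counter
from difflib import SequenceMatcher


def numeric_ranges(values, max_gap):
    """Group 1-D values into ranges, splitting wherever the gap between
    consecutive sorted values exceeds max_gap (a stand-in for clustering)."""
    ordered = sorted(values)
    if not ordered:
        return []
    clusters = [[ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][-1] > max_gap:
            clusters.append([v])      # large gap: start a new range
        else:
            clusters[-1].append(v)    # small gap: extend the current range
    # Report each range as (minimum, maximum, number of values).
    return [(c[0], c[-1], len(c)) for c in clusters]


def suspect_typos(terms, min_ratio=0.85):
    """Map each spelling to a more frequent, highly similar spelling, if any."""
    counts = Counter(terms)
    fixes = {}
    for rare in counts:
        for common in counts:
            if counts[common] <= counts[rare]:
                continue
            if SequenceMatcher(None, rare, common).ratio() >= min_ratio:
                # Keep the most frequent similar candidate.
                if rare not in fixes or counts[common] > counts[fixes[rare]]:
                    fixes[rare] = common
    return fixes


# Hypothetical sample data, invented for illustration only.
ages = [34, 36, 41, 45, 52, 58, 430]
diagnoses = ["arthritis"] * 5 + ["arthritsi"] + ["carcinoma"] * 3

age_ranges = numeric_ranges(ages, max_gap=20)
typo_fixes = suspect_typos(diagnoses)
```

In this sketch, a range that holds very few values (such as the isolated age of 430) or a rare spelling close to a frequent one would be flagged for expert review rather than corrected automatically, in line with the abstract's stated goal of supporting experts in detecting inconsistencies.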
Keywords :
data analysis; database management systems; distributed processing; medical information systems; ontologies (artificial intelligence); pattern clustering; automatic generation; biological processes; biomedical database; biomedical sources; breast cancer patients; clustering technique; collaborative efforts; data analysis; database integration; distributed scenario; gene expression patterns; ontology-based KDD model; preprocessing ontologies; remote data sources; rheumatoid arthritis; synonym transformation; typographical error; valid numerical ranges; Arthritis; Biological processes; Biological system modeling; Data analysis; Databases; Dictionaries; Gene expression; International collaboration; Ontologies; Testing; Integration; KDD; Ontologies; Preprocessing;
Conference_Title :
Computer-Based Medical Systems, 2008. CBMS '08. 21st IEEE International Symposium on
Conference_Location :
Jyvaskyla
Print_ISBN :
978-0-7695-3165-6
DOI :
10.1109/CBMS.2008.71