DocumentCode
2094472
Title
Automatic Generation of Integration and Preprocessing Ontologies for Biomedical Sources in a Distributed Scenario
Author
Anguita, Alberto ; Perez-Rey, David ; Crespo, José ; Maojo, Víctor
Author_Institution
Biomed. Inf. Group, Univ. Politec. de Madrid, Madrid
fYear
2008
fDate
17-19 June 2008
Firstpage
336
Lastpage
341
Abstract
Access to a large number of remote data sources has boosted research in biomedicine, where different biological and clinical research projects are based on collaborative efforts among international organizations. In this scenario, the authors have developed various methods and tools in the area of database integration, using an ontological approach. This paper describes a method to automatically generate preprocessing structures (ontologies) within an ontology-based KDD model. These ontologies are obtained from the analysis of data sources, searching for: (i) valid numerical ranges (using clustering techniques), (ii) different scales, (iii) synonym transformations based on known dictionaries, and (iv) typographical errors. To test the method, experiments were carried out with four biomedical databases (covering rheumatoid arthritis, gene expression patterns, biological processes and breast cancer patients), demonstrating the performance of the approach. This method supports experts in data analysis processes, facilitating the detection of inconsistencies.
Keywords
data analysis; database management systems; distributed processing; medical information systems; ontologies (artificial intelligence); pattern clustering; automatic generation; biological processes; biomedical database; biomedical sources; breast cancer patients; clustering technique; collaborative efforts; data analysis; database integration; distributed scenario; gene expression patterns; ontology-based KDD model; preprocessing ontologies; remote data sources; rheumatoid arthritis; synonym transformation; typographical error; valid numerical ranges; Arthritis; Biological processes; Biological system modeling; Data analysis; Databases; Dictionaries; Gene expression; International collaboration; Ontologies; Testing; Integration; KDD; Ontologies; Preprocessing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer-Based Medical Systems, 2008. CBMS '08. 21st IEEE International Symposium on
Conference_Location
Jyväskylä
ISSN
1063-7125
Print_ISBN
978-0-7695-3165-6
Type
conf
DOI
10.1109/CBMS.2008.71
Filename
4562013