DocumentCode :
1640811
Title :
Parallel integration of heterogeneous genome-wide data sources
Author :
Greene, Derek ; Bryan, Kenneth ; Cunningham, Pádraig
Author_Institution :
Sch. of Comput. Sci. & Inf., Univ. Coll. Dublin, Dublin
fYear :
2008
Firstpage :
1
Lastpage :
7
Abstract :
Heterogeneous genome-wide data sources capture information on various aspects of complex biological systems. For instance, transcriptome, interactome and phenome-level information may be derived from mRNA expression data, protein-protein interaction networks, and biomedical literature corpora. Each source provides a distinct "view" of the same domain, but potentially encodes different biologically-relevant patterns. Effective integration of such views can provide a richer, more informative model of an organism's functional modules than that produced on a single view alone. Existing machine learning strategies for information fusion largely focus on the production of a consensus model that reflects patterns shared between views. However, the information provided by different views may not always be easily reconciled, due to the incomplete nature of the data, or the fact that some patterns will be present in one view but not in another. To address this problem, we present the Parallel Integration Clustering Algorithm (PICA), a novel cluster analysis approach which supports the simultaneous integration of information from two or more sources. The resulting model preserves patterns that are unique to individual views, as well as those common to all views. We demonstrate the effectiveness of PICA in identifying significant patterns corresponding to functional groupings, when applied to three genome-wide datasets.
Keywords :
bioinformatics; genomics; learning (artificial intelligence); parallel algorithms; proteins; PICA method; Parallel Integration Clustering Algorithm; biomedical literature corpora; complex biological systems; heterogeneous genome-wide data sources; information capture; interactome level information; mRNA expression; machine learning; phenome level information; protein-protein interaction networks; transcriptome level information; Bioinformatics; Biological information theory; Biological system modeling; Biological systems; Clustering algorithms; Genomics; Information analysis; Machine learning; Production; Proteins;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
BioInformatics and BioEngineering, 2008. BIBE 2008. 8th IEEE International Conference on
Conference_Location :
Athens
Print_ISBN :
978-1-4244-2844-1
Electronic_ISBN :
978-1-4244-2845-8
Type :
conf
DOI :
10.1109/BIBE.2008.4696710
Filename :
4696710