Title :
Combining Clustering and Bayesian Network for Gene Network Inference
Author :
Zainudin, Suhaila ; Deris, Safaai
Author_Institution :
Fakulti Teknol. dan Sains Maklumat, Univ. Kebangsaan Malaysia, Bangi
Abstract :
Gene network reconstruction is a multidisciplinary research area involving data mining, machine learning, statistics, ontologies and other fields. A reconstructed gene network allows us to understand how genes interact with each other. However, network construction is very complex due to the highly interactive nature of genes. One proposed approach to this problem is to cluster the genes according to the similarity of their gene expression profiles. We applied k-means clustering with k = 10 to obtain ten clusters of genes. Then, we applied Bayesian Network structure learning with a Hill-climbing search strategy and the Akaike Information Criterion score to search for the best network. We compared the inferred interactions to a reference dataset of positive interactions and found similarities between our inferred interactions and the reference. We further studied the gene interactions using Gene Ontology. From our findings, we conclude that the clustering step is essential in gene network reconstruction. Clustering produced better groups of genes for Bayesian Network learning. Larger clusters also produced more gene interactions. Gene Ontology can be combined with clustering to produce better quality clusters and so improve gene network construction.
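The clustering step and the scoring criterion named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmeans` and `aic`, the toy expression profiles, the random initialisation and the iteration limit are all assumptions made for illustration, and the hill-climbing search over network structures is omitted. The AIC is written in its standard form, 2p - 2 ln L, where lower values indicate a better trade-off between fit and complexity.

```python
import random


def kmeans(profiles, k, iters=50, seed=0):
    """Group expression profiles (lists of floats) into k clusters.

    Plain Lloyd's algorithm with randomly chosen initial centroids.
    Returns one cluster label per profile.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2
                                  for x, y in zip(p, centroids[c])))
            for p in profiles
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels


def aic(log_likelihood, n_params):
    """Standard Akaike Information Criterion: 2p - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical toy data: four genes measured under two conditions.
toy_profiles = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0], [11.0, 11.0]]
toy_labels = kmeans(toy_profiles, k=2)
print(toy_labels)
```

In a pipeline like the one described, each resulting cluster of genes would be passed separately to a structure-learning routine, which would use a score such as `aic` to compare candidate networks.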
Keywords :
belief networks; biology computing; data mining; learning (artificial intelligence); ontologies (artificial intelligence); pattern clustering; Akaike information criterion score; Bayesian network structure learning; Hill-climbing search; data mining; gene expression profiles; gene network inference; gene network reconstruction; gene ontology; k-means clustering; machine learning; ontologies; statistics; Bayesian methods; Clustering algorithms; Data mining; Gene expression; Machine learning; Ontologies; Organisms; Partitioning algorithms; Sequences; Systems biology; Bayesian Network; Gene expression data; clustering;
Conference_Title :
Intelligent Systems Design and Applications, 2008. ISDA '08. Eighth International Conference on
Print_ISBN :
978-0-7695-3382-7
DOI :
10.1109/ISDA.2008.183