Title :
On clustering biological data using unsupervised and semi-supervised message passing
Author :
Geng, Huimin ; Deng, Xutao ; Bastola, Dhundy ; Ali, Hesham
Author_Institution :
Dept. of Pathology & Microbiol., Nebraska Univ., Omaha, NE, USA
Abstract :
Noting that unsupervised clustering may produce clusters that are irrelevant to the research hypotheses and interests, we generalize traditional unsupervised clustering into semi-supervised clustering based on our previously proposed message passing clustering (MPC). In the semi-supervised MPC, prior knowledge such as instance-level and attribute-level constraints is used to guide the clustering process towards better and more interpretable partitions. We applied the unsupervised MPC (background) to phylogenetic analysis of Mycobacterium and the semi-supervised MPC to colon cancer microarray data analysis. The results show that MPC is superior to the widely accepted neighbor-joining and hierarchical clustering methods, and that the semi-supervised MPC is even more powerful in biological data analysis such as gene selection and cancer diagnosis using microarrays.
Keywords :
cancer; genetics; medical diagnostic computing; message passing; patient diagnosis; statistical analysis; Mycobacterium; attribute-level constraints; biological data clustering; cancer diagnosis; colon cancer microarray data analysis; gene selection; hierarchical clustering; instance-level constraints; message passing clustering; neighbor-joining clustering; phylogenetic analysis; semi-supervised clustering; unsupervised clustering; Bioinformatics; Cancer; Clustering algorithms; Data analysis; Data mining; Machine learning algorithms; Message passing; Ontologies; Partitioning algorithms; Phylogeny
Conference_Title :
Bioinformatics and Bioengineering, 2005. BIBE 2005. Fifth IEEE Symposium on
Print_ISBN :
0-7695-2476-1
DOI :
10.1109/BIBE.2005.44