DocumentCode :
1988618
Title :
On the Effectiveness of Constraints Sets in Clustering Genes
Author :
Zeng, Erliang ; Yang, Chengyong ; Li, Tao ; Narasimhan, Giri
Author_Institution :
Florida Int. Univ., Miami
fYear :
2007
fDate :
14-17 Oct. 2007
Firstpage :
79
Lastpage :
86
Abstract :
In this paper, we modify a constrained clustering algorithm to perform exploratory analysis on gene expression data using prior knowledge presented in the form of constraints. We also study the effectiveness of various constraints sets. To address the problem of automatically generating constraints from biological text literature, we considered two methods (cluster-based and similarity-based). We concluded that incomplete information, in the form of a constraints set, must be generated carefully if the constrained algorithm is to outperform the standard clustering algorithm, which works on the data source without any constraints. For sufficiently large constraints sets, the constrained clustering algorithm outperformed the MSC algorithm. The novelty of the research presented here lies in the study of the effectiveness of constraints sets and the robustness of the constrained clustering algorithm using multiple sources of biological data, and in incorporating biomedical text literature into the constrained clustering algorithm in the form of constraints sets.
Keywords :
biology computing; genetics; MPCK-means algorithm; cluster-based method; constrained clustering algorithm; constraints sets; gene expression; similarity-based method; Algorithm design and analysis; Bioinformatics; Cities and towns; Clustering algorithms; Gene expression; Genomics; Information analysis; Performance analysis; Proteins; Robustness;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on
Conference_Location :
Boston, MA
Print_ISBN :
978-1-4244-1509-0
Type :
conf
DOI :
10.1109/BIBE.2007.4375548
Filename :
4375548