DocumentCode :
1065807
Title :
Biclustering algorithms for biological data analysis: a survey
Author :
Madeira, Sara C. ; Oliveira, Arlindo L.
Author_Institution :
Beira Interior Univ., Covilha, Portugal
Volume :
1
Issue :
1
fYear :
2004
Firstpage :
24
Lastpage :
45
Abstract :
A large number of clustering approaches have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the results from the application of standard clustering methods to genes are limited. This limitation is imposed by the existence of a number of experimental conditions where the activity of genes is uncorrelated. A similar limitation exists when clustering of conditions is performed. For this reason, a number of algorithms that perform simultaneous clustering on the row and column dimensions of the data matrix have been proposed. The goal is to find submatrices, that is, subgroups of genes and subgroups of conditions, where the genes exhibit highly correlated activities for every condition. In this paper, we refer to this class of algorithms as biclustering. Biclustering is also referred to in the literature as coclustering and direct clustering, among other names, and has also been used in fields such as information retrieval and data mining. In this comprehensive survey, we analyze a large number of existing approaches to biclustering, and classify them in accordance with the type of biclusters they can find, the patterns of biclusters that are discovered, the methods used to perform the search, the approaches used to evaluate the solution, and the target applications.
Keywords :
biology computing; genetics; molecular biophysics; pattern clustering; biclustering algorithms; biological data analysis; coclustering; data mining; direct clustering; gene expression; gene subgroups; information retrieval; microarray experiments; Clustering algorithms; Clustering methods; Data analysis; Data mining; Gene expression; Information retrieval; Pattern analysis; Performance analysis; Performance evaluation; Semiconductor device measurement; Biclustering; bidimensional clustering; biological data analysis; block clustering; coclustering; direct clustering; gene expression data; microarray data analysis; simultaneous clustering; subspace clustering; two-mode clustering; two-sided clustering; two-way clustering; Algorithms; Cluster Analysis; Computational Biology; Gene Expression; Gene Expression Profiling; Humans; Models, Statistical; Oligonucleotide Array Sequence Analysis; Saccharomyces cerevisiae;
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2004.2
Filename :
1324618