DocumentCode :
1114545
Title :
Identification of Regulatory Modules in Time Series Gene Expression Data Using a Linear Time Biclustering Algorithm
Author :
Madeira, Sara C. ; Teixeira, Miguel C. ; Sá-Correia, Isabel ; Oliveira, Arlindo L.
Author_Institution :
Dept. de Informática, Univ. da Beira Interior, Covilhã, Portugal
Volume :
7
Issue :
1
fYear :
2010
Firstpage :
153
Lastpage :
165
Abstract :
Although most biclustering formulations are NP-hard, in time series expression data analysis, it is reasonable to restrict the problem to the identification of maximal biclusters with contiguous columns, which correspond to coherent expression patterns shared by a group of genes in consecutive time points. This restriction leads to a tractable problem. We propose an algorithm that finds and reports all maximal contiguous column coherent biclusters in time linear in the size of the expression matrix. The linear time complexity of CCC-Biclustering relies on the use of a discretized matrix and efficient string processing techniques based on suffix trees. We also propose a method for ranking biclusters based on their statistical significance and a methodology for filtering highly overlapping and, therefore, redundant biclusters. We report results in synthetic and real data showing the effectiveness of the approach and its relevance in the discovery of regulatory modules. Results obtained using the transcriptomic expression patterns occurring in Saccharomyces cerevisiae in response to heat stress show not only the ability of the proposed methodology to extract relevant information compatible with documented biological knowledge but also the utility of using this algorithm in the study of other environmental stresses and of regulatory modules in general.
Keywords :
biology computing; genetics; molecular biophysics; CCC-biclustering; Saccharomyces cerevisiae; biological knowledge; discretized matrix; environmental stresses; gene expression data; linear time biclustering algorithm; linear time complexity; maximal contiguous column coherent biclusters; regulatory module identification; regulatory modules; string processing techniques; time series expression data analysis; transcriptomic expression patterns; Biclustering; expression patterns; time series gene expression data; Algorithms; Cluster Analysis; Gene Expression Profiling; Gene Expression Regulation; Pattern Recognition, Automated; Proteome; Regulatory Elements, Transcriptional; Signal Transduction;
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2008.34
Filename :
4479442