DocumentCode :
2950320
Title :
Biclustering of DNA microarray data with early pruning
Author :
Tewfik, Ahmed H. ; Tchagang, Alain B.
Author_Institution :
Dept. of Electr. & Comput. Eng., Minnesota Univ., USA
Volume :
5
fYear :
2005
fDate :
18-23 March 2005
Abstract :
Uncovering genetic pathways is equivalent to finding clusters of genes with expression levels that evolve coherently under subsets of conditions. This can be done by applying a biclustering procedure to gene expression data. We propose a new biclustering procedure that derives biclusters from candidate subsets of conditions. These candidate subsets are identified by comparing the expression data of pairs of genes. To reduce complexity, the procedure discards, early in the stage that forms the candidate subsets of conditions, any subset that is predicted to have fewer than a desired minimum number of conditions. When the biclusters are required to have more than a minimum number of genes, we show that a further reduction in complexity can be achieved with no loss of performance by comparing each gene with only a subset of all genes. The proposed approach finds all genes whose expression levels evolve coherently under each of the candidate subsets of conditions using a fast approximate pattern matching technique. This approximate pattern matching procedure can find a pattern in a list even if instances of the pattern in the list have random insertions of characters between consecutive characters of the pattern. In contrast to prior techniques, the approach finds all maximum-size biclusters with a number of conditions greater than a specified minimum. Its run time is equivalent to that of the fastest of these techniques, even though the fastest biclustering techniques are not guaranteed to find all biclusters.
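Two ideas from the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so the discretized expression labels ("U" for up, "D" for down), the function names, and the specific pruning bound below are all assumptions made for illustration. The first function checks whether a pattern occurs in a list when arbitrary items may be inserted between consecutive pattern items (an ordered subsequence test). The second compares one pair of genes and abandons the comparison as soon as the candidate subset of conditions can no longer reach the required minimum size.

```python
def contains_with_insertions(pattern, sequence):
    """Return True if `pattern` appears in `sequence` in order,
    allowing extra items between consecutive pattern items.
    (Hypothetical helper: an ordered subsequence test.)"""
    remaining = iter(sequence)
    # `symbol in remaining` advances the iterator until it finds symbol,
    # so each pattern item must be found after the previous one.
    return all(symbol in remaining for symbol in pattern)


def shared_conditions(row_a, row_b, min_conditions):
    """Compare two genes' discretized expression rows of equal length.
    Return the indices of conditions where they agree, or None if the
    candidate subset is pruned early. (Assumed pruning rule, for
    illustration only.)"""
    n = len(row_a)
    shared = []
    for i, (a, b) in enumerate(zip(row_a, row_b)):
        if a == b:
            shared.append(i)
        # Upper bound on the final size: agreements so far plus every
        # condition not yet examined. Stop once the bound is too small.
        if len(shared) + (n - i - 1) < min_conditions:
            return None
    return shared
```

For example, under these assumptions `contains_with_insertions("UDU", "UXDYYU")` holds because U, D, U appear in that order with other characters in between, while `shared_conditions("UDUUD", "DUDDU", 3)` stops after the third condition, since at most two agreements would still be possible.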
Keywords :
DNA; genetics; pattern clustering; pattern matching; DNA microarray data biclustering; approximate pattern matching technique; early pruning; gene expression data; genetic pathways; Clustering algorithms; DNA; Data analysis; Diseases; Displays; Gene expression; Genetic expression; Pattern matching; Performance loss;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8874-7
Type :
conf
DOI :
10.1109/ICASSP.2005.1416418
Filename :
1416418