DocumentCode
2950320
Title
Biclustering of DNA microarray data with early pruning
Author
Tewfik, Ahmed H. ; Tchagang, Alain B.
Author_Institution
Dept. of Electr. & Comput. Eng., Minnesota Univ., USA
Volume
5
fYear
2005
fDate
18-23 March 2005
Abstract
Uncovering genetic pathways is equivalent to finding clusters of genes whose expression levels evolve coherently under subsets of conditions. This can be done by applying a biclustering procedure to gene expression data. We propose a new biclustering procedure that derives biclusters from candidate subsets of conditions. These candidate subsets are identified by comparing the expression data of pairs of genes. To reduce complexity, the procedure discards, early in the stage that forms the candidate subsets, any subset that is predicted to have fewer than a desired minimum number of conditions. When the biclusters are required to have more than a minimum number of genes, we show that a further reduction in complexity can be achieved with no loss of performance by comparing each gene with only a subset of all genes. The proposed approach finds all genes whose expression levels evolve coherently under each of the candidate subsets of conditions using a fast approximate pattern matching technique. This approximate pattern matching procedure can find a pattern in a list even if instances of the pattern in the list have random insertions of characters between consecutive characters of the pattern. Compared with prior techniques, the approach finds all maximum-size biclusters with a number of conditions greater than a specified minimum. Its run time is equivalent to that of the fastest of these techniques, even though the fastest biclustering techniques are not guaranteed to find all biclusters.
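The matching behavior described in the abstract (finding a pattern in a list despite extra characters inserted between consecutive pattern characters) can be read as an ordered subsequence test. The sketch below illustrates only that reading; it is not the authors' procedure, and the names `matches_with_insertions` and `find_matching_genes`, as well as the toy data, are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence


def matches_with_insertions(pattern: Sequence[Hashable],
                            items: Sequence[Hashable]) -> bool:
    """Return True if `pattern` occurs in `items` in order, allowing any
    number of extra elements between consecutive pattern elements."""
    remaining = iter(items)
    # `symbol in remaining` consumes the iterator up to the first match,
    # so each pattern element must be found after the previous one.
    return all(symbol in remaining for symbol in pattern)


def find_matching_genes(pattern: Sequence[Hashable],
                        gene_lists: Dict[str, Sequence[Hashable]]) -> List[str]:
    """Hypothetical helper: names of genes whose list contains `pattern`
    as an ordered subsequence."""
    return [name for name, items in gene_lists.items()
            if matches_with_insertions(pattern, items)]


if __name__ == "__main__":
    # Toy data (assumed): each gene maps to an ordered list of condition ids.
    genes = {"g1": [1, 5, 2, 3], "g2": [2, 1, 3], "g3": [1, 2, 9, 3]}
    print(find_matching_genes([1, 2, 3], genes))
```

Each test is a single pass over the list, so its cost is linear in the list length. How the paper encodes expression profiles as patterns and lists is not stated in the abstract, so the condition-id encoding here is an assumption.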
Keywords
DNA; genetics; pattern clustering; pattern matching; DNA microarray data biclustering; approximate pattern matching technique; early pruning; gene expression data; genetic pathways; Clustering algorithms; Data analysis; Diseases; Displays; Gene expression; Genetic expression; Pattern matching; Performance loss
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-8874-7
Type
conf
DOI
10.1109/ICASSP.2005.1416418
Filename
1416418