Title :
Computational Discovery of Motifs Using Hierarchical Clustering Techniques
Author :
Wang, Dianhui ; Lee, Nung Kion
Author_Institution :
Dept. of Comput. Sci. & Comput. Eng., La Trobe Univ., Melbourne, VIC
Abstract :
Motif discovery plays a key role in understanding gene regulation in organisms. Existing motif discovery tools show weaknesses in reliability and scalability, which motivates the development of more advanced algorithms. This paper develops data mining techniques for discovering motifs. It proposes a mismatch-based hierarchical clustering algorithm that employs three heuristic rules for classifying clusters and a post-processing step for ranking and refining them. The algorithm is evaluated on two sets of DNA sequences and compared with existing tools. Results show that the proposed techniques outperform MEME, AlignACE and SOMBRERO on most of the test datasets.
Keywords :
bioinformatics; data mining; genetics; pattern classification; pattern clustering; gene regulation; heuristic rule; mismatch based hierarchical clustering algorithm; motif discovery; Clustering algorithms; Computer science; DNA; Data engineering; Frequency; Gene expression; Organisms; Reliability engineering; Sequences
Conference_Title :
Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
Conference_Location :
Pisa
Print_ISBN :
978-0-7695-3502-9
DOI :
10.1109/ICDM.2008.21