DocumentCode
2991921
Title
A MapReduce-based Algorithm for Motif Search
Author
Huo, Hongwei ; Lin, Shuai ; Yu, Qiang ; Zhang, Yipu ; Stojkovic, Vojislav
Author_Institution
Sch. of Comput. Sci. & Technol., Xidian Univ., Xi'an, China
fYear
2012
fDate
21-25 May 2012
Firstpage
2052
Lastpage
2060
Abstract
Motif search plays an important role in gene finding and in understanding gene regulatory relationships, and it is one of the most challenging problems in bioinformatics. In this paper, we present three data partitions for the PMSP algorithm and propose the PMSP MapReduce algorithm (PMSPMR) for solving the motif search problem. Experimental results on a Hadoop cluster, for problem instances of varying difficulty, demonstrate that PMSPMR has good scalability. In particular, PMSPMR shows its advantage on the more difficult motif search problems, where the speedup is almost linearly proportional to the number of nodes in the Hadoop cluster. We also present experimental results on realistic biological data, identifying known transcriptional regulatory motifs in eukaryotes as well as in actual promoter sequences extracted from Saccharomyces cerevisiae.
Keywords
bioinformatics; genetics; search problems; Hadoop cluster; PMSP MapReduce algorithm; PMSPMR; Saccharomyces cerevisiae; data partitions; gene finding; gene regulation relationship; motif search; Algorithm design and analysis; Clustering algorithms; DNA; Hamming distance; Partitioning algorithms; Search problems; Hadoop; MapReduce; Motif search; data partition; scalability
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
Conference_Location
Shanghai
Print_ISBN
978-1-4673-0974-5
Type
conf
DOI
10.1109/IPDPSW.2012.255
Filename
6270414