• DocumentCode
    2991921
  • Title

    A MapReduce-based Algorithm for Motif Search

  • Author

    Huo, Hongwei ; Lin, Shuai ; Yu, Qiang ; Zhang, Yipu ; Stojkovic, Vojislav

  • Author_Institution
    Sch. of Comput. Sci. & Technol., Xidian Univ., Xi'an, China
  • fYear
    2012
  • fDate
    21-25 May 2012
  • Firstpage
    2052
  • Lastpage
    2060
  • Abstract
    Motif search plays an important role in gene finding and in understanding gene regulatory relationships. It is one of the most challenging problems in bioinformatics. In this paper, we present three data partitions for the PMSP algorithm and propose the PMSP MapReduce algorithm (PMSPMR) for solving the motif search problem. For problem instances of varying difficulty, experimental results on a Hadoop cluster demonstrate that PMSPMR has good scalability. In particular, PMSPMR shows its advantage on the more difficult motif search problems, where the speedup grows almost linearly with the number of nodes in the Hadoop cluster. We also present experimental results on realistic biological data by identifying known transcriptional regulatory motifs in eukaryotes as well as in actual promoter sequences extracted from Saccharomyces cerevisiae.
  • Keywords
    bioinformatics; genetics; search problems; Hadoop cluster; PMSP MapReduce algorithm; PMSPMR; Saccharomyces cerevisiae; data partitions; gene finding; gene regulation relationship; motif search; Algorithm design and analysis; Clustering algorithms; DNA; Hamming distance; Partitioning algorithms; Search problems; Hadoop; MapReduce; Motif search; data partition; scalability
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4673-0974-5
  • Type

    conf

  • DOI
    10.1109/IPDPSW.2012.255
  • Filename
    6270414