• DocumentCode
    3233402
  • Title
    Screening Data for Phylogenetic Analysis of Land Plants: A Parallel Approach
  • Author
    Liu Yong ; Meng Zhen ; Liu Qi ; Gao Yanping ; Zhou Yuanchun ; Li Jianhui
  • Author_Institution
    Sci. Data Center, CAS, Beijing, China
  • fYear
    2010
  • fDate
    21-24 Oct. 2010
  • Firstpage
    305
  • Lastpage
    308
  • Abstract
    Screening data for phylogenetic analysis from large datasets is a known computational problem in data-intensive applications. In this paper, we implement an approach to screening sequence data for the Platform for Phylogenetic Analysis of Land Plants (PALPP). The approach uses the MapReduce paradigm to parallelize the Basic Local Alignment Search Tool (BLAST) and manage its execution, and uses machine virtualization to encapsulate its execution environment and commonly used data sets in flexibly deployable virtual machines. Two methods of running BLAST on Hadoop are implemented, and an evaluation of the approach is presented.
  • Keywords
    botany; data analysis; distributed processing; evolution (biological); vegetation; virtual machines; Hadoop; MapReduce; basic local alignment search tool; data screening; land plants; phylogenetic analysis; Bioinformatics; Data mining; Distributed databases; File systems; Phylogeny; Programming; BLAST; PALPP
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Networking and Distributed Computing (ICNDC), 2010 First International Conference on
  • Conference_Location
    Hangzhou
  • Print_ISBN
    978-1-4244-8382-2
  • Type
    conf
  • DOI
    10.1109/ICNDC.2010.66
  • Filename
    5645374