Title :
Screening Data for Phylogenetic Analysis of Land Plants: A Parallel Approach
Author :
Liu Yong ; Meng Zhen ; Liu Qi ; Gao Yanping ; Zhou Yuanchun ; Li Jianhui
Author_Institution :
Sci. Data Center, CAS, Beijing, China
Abstract :
Screening data for phylogenetic analysis from large datasets is a well-known computational problem in data-intensive applications. In this paper, we implement an approach to screening sequence data for the Platform for Phylogenetic Analysis of Land Plants (PALPP). The approach uses the MapReduce paradigm to parallelize the Basic Local Alignment Search Tool (BLAST) and to manage its execution, and it uses machine virtualization to encapsulate the BLAST execution environment and commonly used data sets in flexibly deployable virtual machines. Two methods of running BLAST on Hadoop are implemented, and an evaluation of the approach is presented.
Keywords :
botany; data analysis; distributed processing; evolution (biological); vegetation; virtual machines; Hadoop; MapReduce; basic local alignment search tool; data screening; land plants; phylogenetic analysis; Bioinformatics; Data mining; Distributed databases; File systems; Phylogeny; Programming; BLAST; PALPP;
Conference_Title :
Networking and Distributed Computing (ICNDC), 2010 First International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4244-8382-2
DOI :
10.1109/ICNDC.2010.66