DocumentCode
3233402
Title
Screening Data for Phylogenetic Analysis of Land Plants: A Parallel Approach
Author
Liu Yong ; Meng Zhen ; Liu Qi ; Gao Yanping ; Zhou Yuanchun ; Li Jianhui
Author_Institution
Sci. Data Center, CAS, Beijing, China
fYear
2010
fDate
21-24 Oct. 2010
Firstpage
305
Lastpage
308
Abstract
Screening data for phylogenetic analysis from large datasets is a known computational problem in data-intensive applications. In this paper, we implement an approach to screen sequence data for the Platform for Phylogenetic Analysis of Land Plants (PALPP). The approach uses the MapReduce paradigm to parallelize the Basic Local Alignment Search Tool (BLAST) and to manage its execution, and uses machine virtualization to encapsulate its execution environment and commonly used data sets into flexibly deployable virtual machines. Two methods of running BLAST on Hadoop are implemented, and an evaluation of the approach is presented.
Keywords
botany; data analysis; distributed processing; evolution (biological); vegetation; virtual machines; Hadoop; MapReduce; basic local alignment search tool; data screening; land plants; phylogenetic analysis; virtual machines; Bioinformatics; Data mining; Distributed databases; File systems; Phylogeny; Programming; BLAST; Data screening; Hadoop; MapReduce; PALPP;
fLanguage
English
Publisher
IEEE
Conference_Titel
Networking and Distributed Computing (ICNDC), 2010 First International Conference on
Conference_Location
Hangzhou
Print_ISBN
978-1-4244-8382-2
Type
conf
DOI
10.1109/ICNDC.2010.66
Filename
5645374