Title :
Hadoop Applications in Bioinformatics
Author :
Li Xubin ; Jiang Wenrui ; Jiang Yi ; Zou Quan
Author_Institution :
Sch. of Inf. Sci. & Technol., Xiamen Univ., Xiamen, China
Abstract :
Bioinformatics faces a dilemma: traditional analysis tools struggle with the large-scale data produced by high-throughput sequencing. In recent years, the open source Apache Hadoop project, which combines the MapReduce framework with a distributed file system, has given bioinformatics researchers the opportunity to obtain scalable, efficient and reliable computing performance on Linux clusters and cloud computing services. In this paper, we present Hadoop-based applications employed in bioinformatics, covering next-generation sequencing and other biological domains. In addition, we discuss the obstacles to, and future work on, Hadoop in bioinformatics.
Keywords :
Linux; bioinformatics; cloud computing; distributed databases; parallel programming; public domain software; Linux clusters; MapReduce framework; biological domains; cloud computing service; distributed file system; high-throughput sequencing; large-scale data; next-generation sequencing; open source Apache Hadoop project; Assembly; Genomics; Graphics processing units; Sequential analysis; Cloud; Hadoop; MapReduce;
Conference_Title :
Open Cirrus Summit (OCS), 2012 Seventh
Conference_Location :
Beijing
DOI :
10.1109/OCS.2012.40