Title :
Design of an NGS MicroRNA predictor using multilayer hierarchical MapReduce framework
Author :
Ren-Hao Pan;Lin-Yu Tseng;I-En Liao;Chien-Lung Chan;K. Robert Lai;Kai-Biao Lin
Author_Institution :
Innovation Center for Big data and Digital Convergence, Yuan Ze University, Taoyuan 320, Taiwan
Abstract :
MicroRNAs (miRNAs) are a group of small noncoding RNA (ncRNA) molecules that play an important role in biological functions. This paper proposes a miRNA prediction application that is based on a multi-layer hierarchical MapReduce framework and provides four prediction workflows for four different datasets: miRNA-like sequences, miRNA cluster sequences, unknown miRNA sequences, and next-generation sequencing (NGS) sequences. These workflows include four core procedures for finding the genome location and applying the biological filtering criteria and a genetic-algorithm-based pre-miRNA classifier. Each procedure works as a MapReduce task and uses the JSON format to pass its output to the next MapReduce procedure. Experimental results show that the proposed miRNA predictor not only achieves high sensitivity and accuracy but can also process more than a million sequences in acceptable time by relying on the multi-layer hierarchical MapReduce framework.
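The chaining described in the abstract, where each procedure is a MapReduce task whose JSON output feeds the next task, can be illustrated with a minimal in-memory sketch. This is not the authors' implementation: the stage names (`count_mapper`, `filter_mapper`, and so on), the record fields `seq` and `count`, and the 18 to 25 nucleotide length window are all assumptions chosen for illustration, and a simple length filter stands in for the paper's biological filtering criteria.

```python
import json
from collections import defaultdict


def run_stage(json_lines, mapper, reducer):
    """Run one in-memory MapReduce stage: JSON lines in, JSON lines out."""
    groups = defaultdict(list)
    # Map phase: decode each JSON record and group emitted values by key.
    for line in json_lines:
        record = json.loads(line)
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce phase: encode every reducer result as one JSON line.
    output = []
    for key in sorted(groups):
        for result in reducer(key, groups[key]):
            output.append(json.dumps(result, sort_keys=True))
    return output


# Hypothetical stage 1: collapse identical reads and count them.
def count_mapper(record):
    yield record["seq"], 1


def count_reducer(seq, counts):
    yield {"seq": seq, "count": sum(counts)}


# Hypothetical stage 2: a length window (an assumed placeholder,
# not the paper's actual biological filtering criteria).
MIN_LEN, MAX_LEN = 18, 25


def filter_mapper(record):
    if MIN_LEN <= len(record["seq"]) <= MAX_LEN:
        yield record["seq"], record["count"]


def filter_reducer(seq, counts):
    yield {"seq": seq, "count": sum(counts), "candidate": True}


def run_pipeline(reads):
    """Chain the stages; each stage consumes the previous stage's JSON output."""
    lines = [json.dumps({"seq": read}) for read in reads]
    stages = [(count_mapper, count_reducer), (filter_mapper, filter_reducer)]
    for mapper, reducer in stages:
        lines = run_stage(lines, mapper, reducer)
    return [json.loads(line) for line in lines]


if __name__ == "__main__":
    reads = ["UGAGGUAGUAGGUUGUAUAGUU", "UGAGGUAGUAGGUUGUAUAGUU", "ACGU"]
    for candidate in run_pipeline(reads):
        print(candidate)
```

Because every stage reads and writes the same line-oriented JSON records, stages can be reordered or added without changing the driver loop, which is one plausible reason to use JSON as the hand-off format between MapReduce tasks.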
Keywords :
"RNA","Bioinformatics","Genetic algorithms","Genomics","Sequential analysis","Proteins"
Conference_Title :
2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA)
Print_ISBN :
978-1-4673-8272-4
DOI :
10.1109/DSAA.2015.7344862