DocumentCode
3605087
Title
SNAPR: A Bioinformatics Pipeline for Efficient and Accurate RNA-Seq Alignment and Analysis
Author
Magis, Andrew T. ; Funk, Cory C. ; Price, Nathan D.
Author_Institution
Inst. for Syst. Biol., Seattle, WA, USA
Volume
1
Issue
2
fYear
2015
Firstpage
22
Lastpage
25
Abstract
Converting raw RNA sequencing (RNA-seq) data into interpretable results can be circuitous and time-consuming, requiring multiple steps. We present an RNA-seq mapping algorithm that streamlines this process. Our algorithm uses a hash table approach to take advantage of the availability and power of high-memory machines. SNAPR, which can be run on a single library or on thousands of libraries, accepts compressed or uncompressed FASTQ and BAM files; outputs a sorted BAM file, individual read counts, and gene fusions; and identifies exogenous RNA species in a single step. SNAPR also natively filters reads by Phred score and is well suited to future sequencing platforms that generate longer reads. We show that data from hundreds of TCGA samples can be analyzed in a matter of hours while gene fusions and viral events are identified at the same time. Because the reference genome and transcriptome undergo periodic updates, and because uniform parameters are needed when integrating multiple data sets, there is a great need for a streamlined RNA-seq analysis process. We demonstrate that SNAPR meets this need efficiently and accurately.
Keywords
RNA; bioinformatics; genomics; BAM file; FASTQ file; RNA sequencing; RNA-seq alignment; RNA-seq analysis; RNA-seq data process; RNA-seq mapping; SNAPR; bioinformatics pipeline; exogenous RNA species; gene fusion; high memory machine; periodic update; reference genome; single library; transcriptome; viral event; Algorithm design and analysis; Cancer; Databases; Sequential analysis; biology; biology computing; computational biology; genetic expression
fLanguage
English
Journal_Title
Life Sciences Letters, IEEE
Publisher
IEEE
ISSN
2332-7685
Type
jour
DOI
10.1109/LLS.2015.2465870
Filename
7229277