DocumentCode
1514387
Title
A Memory Efficient Method for Structure-Based RNA Multiple Alignment
Author
DeBlasio, D. ; Bruand, J. ; Shaojie Zhang
Author_Institution
Dept. of Electr. Eng. & Comput. Sci., Univ. of Central Florida, Orlando, FL, USA
Volume
9
Issue
1
fYear
2012
Firstpage
1
Lastpage
11
Abstract
Structure-based RNA multiple alignment is particularly challenging because covarying mutations make sequence information alone insufficient. Existing tools for RNA multiple alignment first generate pairwise RNA structure alignments and then build the multiple alignment using only sequence information. Here we present PMFastR, an algorithm that iteratively uses a sequence-structure alignment procedure to build a structure-based RNA multiple alignment from one sequence with known structure and a database of sequences from the same family. PMFastR also has low memory consumption, allowing for the alignment of large sequences such as 16S and 23S rRNA. The algorithm also provides a method to utilize a multicore environment. We present results on benchmark data sets from BRAliBase, which show that PMFastR performs comparably to other state-of-the-art programs. Finally, we regenerate 607 Rfam seed alignments and show that our automated process creates multiple alignments similar to the manually curated Rfam seed alignments. Thus, the techniques presented in this paper allow for the generation of multiple alignments using sequence-structure guidance while limiting memory consumption. As a result, multiple alignments of long RNA sequences, such as 16S and 23S rRNAs, can easily be generated locally on a personal computer. The software and supplementary data are available at http://genome.ucf.edu/PMFastR.
Keywords
macromolecules; microcomputers; molecular biophysics; organic compounds; 607 Rfam seed alignments; BRAliBase; PMFastR; RNA sequence; automated process; benchmark data sets; memory consumption; memory efficient method; multicore environment; personal computer; sequence-structure alignment procedure; software data; structure-based RNA multiple alignment; supplementary data; Arrays; Bioinformatics; Databases; Dynamic programming; Instruction sets; Memory management; RNA; RNA multiple alignment; RNA secondary structure; RNA sequence-structure alignment; iterative alignment; Algorithms; Computational Biology; Databases, Genetic; Nucleic Acid Conformation; RNA; Sequence Alignment; Sequence Analysis, RNA
fLanguage
English
Journal_Title
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
1545-5963
Type
jour
DOI
10.1109/TCBB.2011.86
Filename
5765939