• DocumentCode
    1514387
  • Title

    A Memory Efficient Method for Structure-Based RNA Multiple Alignment

  • Author

    DeBlasio, D. ; Bruand, J. ; Shaojie Zhang

  • Author_Institution
    Dept. of Electr. Eng. & Comput. Sci., Univ. of Central Florida, Orlando, FL, USA
  • Volume
    9
  • Issue
    1
  • fYear
    2012
  • Firstpage
    1
  • Lastpage
    11
  • Abstract
    Structure-based RNA multiple alignment is particularly challenging because covarying mutations make sequence information alone insufficient. Existing tools for RNA multiple alignment first generate pairwise RNA structure alignments and then build the multiple alignment using only sequence information. Here we present PMFastR, an algorithm that iteratively uses a sequence-structure alignment procedure to build a structure-based RNA multiple alignment from one sequence with known structure and a database of sequences from the same family. PMFastR also has low memory consumption, allowing for the alignment of large sequences such as 16S and 23S rRNA. The algorithm also provides a method to utilize a multicore environment. We present results on benchmark data sets from BRAliBase, which show that PMFastR performs comparably to other state-of-the-art programs. Finally, we regenerate 607 Rfam seed alignments and show that our automated process creates multiple alignments similar to the manually curated Rfam seed alignments. Thus, the techniques presented in this paper allow for the generation of multiple alignments using sequence-structure guidance, while limiting memory consumption. As a result, multiple alignments of long RNA sequences, such as 16S and 23S rRNAs, can easily be generated locally on a personal computer. The software and supplementary data are available at http://genome.ucf.edu/PMFastR.
  • Keywords
    macromolecules; microcomputers; molecular biophysics; organic compounds; 607 Rfam seed alignments; BRAliBase; PMFastR; RNA sequence; automated process; benchmark data sets; memory consumption; memory efficient method; multicore environment; personal computer; sequence-structure alignment procedure; software data; structure-based RNA multiple alignment; supplementary data; Arrays; Bioinformatics; Databases; Dynamic programming; Instruction sets; Memory management; RNA; RNA multiple alignment; RNA secondary structure; RNA sequence-structure alignment; iterative alignment; Algorithms; Computational Biology; Databases, Genetic; Nucleic Acid Conformation; RNA; Sequence Alignment; Sequence Analysis, RNA;
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2011.86
  • Filename
    5765939