• DocumentCode
    3410445
  • Title
    RNA motif search using the structure to string (STR2) method
  • Author
    Bergig, Oriel; Barash, Danny; Kedem, Klara
  • Author_Institution
    Ben-Gurion Univ., Beer-Sheva, Israel
  • fYear
    2004
  • fDate
    16-19 Aug. 2004
  • Firstpage
    660
  • Lastpage
    661
  • Abstract
    We present a novel approach for detecting RNA shapes in selected genes. Aside from traditional sequence-based search methods such as BLAST and FASTA, there is growing interest in detecting specific RNA secondary structure domains with effective structure-based search methods such as RNAMotif. Towards this end, we devise a new algorithm with ideas taken from computational geometry. The method, called structure to string (STR2), was initially developed to detect structural motifs in the tertiary structure of proteins. It converts an RNA secondary structure into a shape-representing string of characters that captures the various structural motifs. To transform an RNA secondary structure into a string of characters, we adopt an approach used in proteomics for generating a collection of fragments. We identify a library of fragments for use in RNA secondary structure, where each fragment is represented by a character. A unique feature of our method is that the fragments represent the geometry of the transitions between the secondary structure elements, such as the curve of the transition between stems and loops. Consequently, we represent the secondary structures of the query and target sequences by their corresponding character string representations and seek shape similarities by applying string matching algorithms. For RNA folding prediction, we use mfold. The method is implemented efficiently using suffix trees and other economization procedures. We show examples of its applicability on aptamer domains that are functionally important and are well predicted by mfold before the conversion to strings.
  • Keywords
    biology computing; computational geometry; macromolecules; molecular biophysics; proteins; string matching; trees (mathematics); RNA folding prediction; RNA motif search; RNA secondary structure domains; RNA shape detection; RNAMotif; aptamer domains; character string representation; computational geometry; economization; effective structure-based search methods; genes; mfold; proteomics; query sequences; shape similarities; string matching algorithms; structure to string method; suffix trees; target sequences; Character generation; Computational geometry; Computer science; Crops; Libraries; Protein engineering; Proteomics; RNA; Search methods; Shape;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
  • Print_ISBN
    0-7695-2194-0
  • Type
    conf
  • DOI
    10.1109/CSB.2004.1332536
  • Filename
    1332536
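
The structure-to-string idea summarized in the abstract can be illustrated with a minimal sketch. It is not the STR2 implementation: the record does not give the fragment library, so the four-letter alphabet below (`A`, `B`, `H`, `U`) is a hypothetical stand-in that labels runs of a dot-bracket structure rather than the geometry of transitions, and Python's built-in substring search replaces the suffix trees mentioned in the abstract. The function names `shape_string` and `find_motif` are likewise assumptions made for illustration.

```python
from itertools import groupby


def shape_string(dot_bracket: str) -> str:
    """Collapse a dot-bracket structure into one character per run.

    Hypothetical alphabet (not the STR2 fragment library):
      A = run of '(' (5' strand of a stem)
      B = run of ')' (3' strand of a stem)
      H = unpaired run directly enclosed by '(' and ')' (hairpin loop)
      U = any other unpaired run
    """
    # One entry per run of identical symbols, e.g. "((..))" -> ['(', '.', ')'].
    runs = [symbol for symbol, _ in groupby(dot_bracket)]
    out = []
    for i, symbol in enumerate(runs):
        if symbol == "(":
            out.append("A")
        elif symbol == ")":
            out.append("B")
        elif symbol == ".":
            prev_symbol = runs[i - 1] if i > 0 else None
            next_symbol = runs[i + 1] if i + 1 < len(runs) else None
            is_hairpin = prev_symbol == "(" and next_symbol == ")"
            out.append("H" if is_hairpin else "U")
        else:
            raise ValueError(f"unexpected symbol in structure: {symbol!r}")
    return "".join(out)


def find_motif(target_db: str, query_db: str) -> list[int]:
    """Return every offset in the target's shape string where the query's
    shape string occurs (plain substring search, overlapping hits included)."""
    target = shape_string(target_db)
    query = shape_string(query_db)
    hits = []
    start = target.find(query)
    while start != -1:
        hits.append(start)
        start = target.find(query, start + 1)
    return hits


if __name__ == "__main__":
    # In the paper the structures come from mfold; here they are typed by hand.
    target = "..((...))..((....)).."
    query = "(((...)))"
    print(shape_string(target))       # UAHBUAHBU
    print(find_motif(target, query))  # [1, 5]
```

Because each run collapses to a single character, stems and loops of different lengths map to the same letter, so the query hairpin matches both hairpins in the target. The reported offsets index the shape string, not nucleotide positions.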