DocumentCode :
3410445
Title :
RNA motif search using the structure to string (STR2) method
Author :
Bergig, Oriel ; Barash, Danny ; Kedem, Klara
Author_Institution :
Ben-Gurion Univ., Beer-Sheva, Israel
fYear :
2004
fDate :
16-19 Aug. 2004
Firstpage :
660
Lastpage :
661
Abstract :
We present a novel approach for detecting RNA shapes in given selected genes. Aside from the traditional sequence-based search methods such as BLAST and FASTA, there is a growing interest in detecting specific RNA secondary structure domains by using effective structure-based search methods such as RNAMotif. Towards this end, we devise a new algorithm with ideas taken from computational geometry. The method, called structure to string (STR2), was initially developed to detect structural motifs in the tertiary structure of proteins. It converts an RNA secondary structure into a shape-representing string of characters that captures the various structural motifs. To transform an RNA secondary structure to a string of characters, we adopt an approach used in proteomics for generating a collection of fragments. We identify a library of fragments for use in RNA secondary structure, where each fragment is represented by a character. A unique feature of our method is that the fragments represent the geometry of the transitions between the secondary structure elements, such as the curve of the transition between stems and loops. Consequently, we represent the secondary structures of the query and target sequences by their corresponding character strings and seek shape similarities by applying string matching algorithms. For the RNA folding prediction we use mfold. The method is implemented efficiently using suffix trees and other economization procedures. We show examples of its applicability on aptamer domains that are functionally important and are well predicted by mfold before the conversion to strings.
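The convert-then-match idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-letter alphabet, the dot-bracket input format, and the function names `structure_to_string` and `find_motif` are assumptions made for illustration, the real STR2 fragment library encodes transition geometry that is not specified here, and plain substring search stands in for the suffix trees mentioned in the abstract.

```python
from itertools import groupby

# Hypothetical alphabet, NOT the STR2 fragment library from the paper:
# "S" = run of opening base pairs, "s" = run of closing base pairs,
# "L" = run of unpaired bases (loop or linker).
SYMBOLS = {"(": "S", ")": "s", ".": "L"}


def structure_to_string(dot_bracket: str) -> str:
    """Collapse each run of identical dot-bracket symbols into one character."""
    return "".join(SYMBOLS[ch] for ch, _ in groupby(dot_bracket))


def find_motif(query_db: str, target_db: str) -> list[int]:
    """Return offsets in the target's shape string where the query's shape occurs.

    Uses simple substring search (str.find) as a stand-in for a suffix tree.
    """
    query = structure_to_string(query_db)
    target = structure_to_string(target_db)
    hits = []
    start = target.find(query)
    while start != -1:
        hits.append(start)
        start = target.find(query, start + 1)
    return hits


if __name__ == "__main__":
    # A hairpin query searched in a target that contains two hairpins.
    print(structure_to_string("((((....))))"))
    print(find_motif("((...))", "..((((....))))..(((...)))."))
```

Because runs are collapsed, the hairpin query matches hairpins of any stem or loop length, which is the sense in which such a string describes shape rather than sequence. The dot-bracket input would in practice come from a folding prediction such as the mfold step named in the abstract.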
Keywords :
biology computing; computational geometry; macromolecules; molecular biophysics; proteins; string matching; trees (mathematics); RNA folding prediction; RNA motif search; RNA secondary structure domains; RNA shape detection; RNAMotif; aptamer domains; character string representation; computational geometry; economization; effective structure-based search methods; genes; mfold; proteomics; query sequences; shape similarities; string matching algorithms; structure to string method; suffix trees; target sequences; Character generation; Computational geometry; Computer science; Crops; Libraries; Protein engineering; Proteomics; RNA; Search methods; Shape;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
Print_ISBN :
0-7695-2194-0
Type :
conf
DOI :
10.1109/CSB.2004.1332536
Filename :
1332536