DocumentCode
3189033
Title
Characterizing RNA Secondary-Structure Features and Their Effects on Splice-Site Prediction
Author
Dogan, Rezarta Islamaj ; Getoor, Lise ; Wilbur, W. John
Author_Institution
Univ. of Maryland, College Park
fYear
2007
fDate
28-31 Oct. 2007
Firstpage
89
Lastpage
94
Abstract
RNA molecules are distinguished by their sequence composition and by their three-dimensional shape, called the secondary structure. The secondary structure of a pre-mRNA sequence may have a strong influence on gene splicing. In our previous work, we showed that a splice-site model employing sequence features built using our feature generation algorithm was very effective in predicting splice sites. The generated sequence features also contained biologically relevant features. In this paper, we extend the feature generation algorithm to construct secondary-structure features. These features capture the nucleotide pairing tendency in the splice-site neighborhood. We extend the splice-site model to include both pre-mRNA sequence and structure characteristics. The new model significantly outperforms the sequence-based features model. The identified secondary-structure features capture biologically relevant signals such as splicing silencers. We also found these signals to prefer specific regions around the splice-site neighborhood and we detail their preference.
Keywords
biology computing; genetics; macromolecules; molecular biophysics; molecular configurations; organic compounds; sequences; RNA molecules; RNA secondary-structure features; feature generation algorithm; gene splicing; nucleotide pairing tendency; pre-mRNA sequence composition; splice-site model; splice-site neighborhood; three-dimensional shape; Biological system modeling; Computer science; DNA; Data mining; Educational institutions; Predictive models; Proteins; RNA; Sequences; Splicing
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining Workshops, 2007. ICDM Workshops 2007. Seventh IEEE International Conference on
Conference_Location
Omaha, NE
Print_ISBN
978-0-7695-3019-2
Electronic_ISBN
978-0-7695-3033-8
Type
conf
DOI
10.1109/ICDMW.2007.119
Filename
4476651