• DocumentCode
    3138801
  • Title
    Evaluation of RNA Secondary Structure Motifs using Regression Analysis
  • Author
    Anwar, Mohammad ; Turcotte, Marcel
  • Author_Institution
    Sch. of Inf. Technol. & Eng., Ottawa Univ., Ont.
  • fYear
    2006
  • fDate
    38838
  • Firstpage
    1747
  • Lastpage
    1752
  • Abstract
    Recent experimental evidence has shown that ribonucleic acid (RNA) plays a greater role in the cell than previously thought. An ensemble of RNA sequences believed to contain signals at the structure level can be exploited to detect functional motifs common to all or a portion of those sequences. We present here a general framework for analyzing multiple RNA secondary structures. A family of related RNA structures may be analyzed using statistical regression methods. In this work, we extend our previously developed algorithm, seed, which exhaustively explores the search space of RNA sequence and structure motifs. We introduce several objective functions based on thermodynamic free energy and information content to discriminate native folds from the rest. We assume that the variation across the various scores can be represented by a statistical model. Regression analysis makes it possible to assign a separate weight to each score, allowing one to emphasize or compensate for the variance that differs across the scores. A statistical model can be formulated using techniques from regression analysis to obtain a template or scoring model that is able to identify putative functional regions in RNA sequences. We show that thermodynamics-based regression models are effective at relating the variation of scores obtained from different functions. The models can generally identify motifs with high specificity and positive predictive value with respect to known motifs. A good scoring method makes it possible to eliminate invalid motifs, thereby reducing the size of the hypothesis space.
  • Keywords
    biology computing; cellular biophysics; molecular biophysics; organic compounds; regression analysis; RNA secondary structure motif; RNA sequence; putative functional region; ribonucleic acid; search space; statistical regression method; thermodynamic free energy; Accuracy; Biological information theory; Biology computing; Genetics; Information technology; Nearest neighbor searches; RNA; Regression analysis; Sequences; Thermodynamics; Motif discovery; linear regression; ribonucleic acid; secondary structure; thermodynamics;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electrical and Computer Engineering, 2006. CCECE '06. Canadian Conference on
  • Conference_Location
    Ottawa, Ont.
  • Print_ISBN
    1-4244-0038-4
  • Electronic_ISBN
    1-4244-0038-4
  • Type
    conf
  • DOI
    10.1109/CCECE.2006.277314
  • Filename
    4054784
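
The abstract describes using regression analysis to assign a separate weight to each score so that several objective functions combine into one scoring model. A minimal sketch of that idea follows, assuming ordinary least squares over two scores per candidate motif (a free-energy score and an information-content score). The function names `solve`, `fit_weights`, and `predict`, the choice of two scores, and all numbers are hypothetical illustrations; the abstract does not specify the paper's actual objective functions, response variable, or fitting procedure.

```python
# Hypothetical sketch: fit one weight per score with ordinary least squares.
# All names and data below are illustrative assumptions, not taken from the paper.

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: the scores are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        partial = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - partial) / m[i][i]
    return x

def fit_weights(scores, targets):
    """Return [intercept, w1, ..., wk] minimizing the squared error (normal equations)."""
    rows = [[1.0] + list(s) for s in scores]  # prepend a constant term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)

def predict(weights, score):
    """Combined score of one candidate motif under the fitted model."""
    return weights[0] + sum(w * s for w, s in zip(weights[1:], score))

# Made-up training data: (free energy, information content) per candidate motif.
scores = [(-12.0, 8.0), (-3.0, 2.0), (-9.0, 7.0), (-5.0, 6.0), (-1.0, 1.0)]
# Made-up response, constructed to be exactly linear in the two scores
# (0.1 - 0.05 * energy + 0.02 * information) so the fit is easy to check.
targets = [0.86, 0.29, 0.69, 0.47, 0.17]

weights = fit_weights(scores, targets)
ranked = sorted(scores, key=lambda s: predict(weights, s), reverse=True)
```

Ranking candidates by the combined score and discarding the low-scoring ones is one way such a model could shrink the hypothesis space; where to place the cutoff is left open here.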