DocumentCode :
980322
Title :
RNA Search with Decision Trees and Partial Covariance Models
Author :
Smith, Jennifer A.
Author_Institution :
Electr. & Comput. Eng. Dept., Boise State Univ., Boise, ID, USA
Volume :
6
Issue :
3
fYear :
2009
Firstpage :
517
Lastpage :
527
Abstract :
The use of partial covariance models to search for RNA family members in genomic sequence databases is explored. The partial models are formed from contiguous subranges of the overall RNA family multiple alignment columns. A binary decision-tree framework is presented for choosing the order to apply the partial models and the score thresholds on which to make the decisions. The decision trees are chosen to minimize computation time subject to the constraint that all of the training sequences are passed to the full covariance model for final evaluation. Computational intelligence methods are suggested to select the decision tree since the tree can be quite complex and there is no obvious method to build the tree in these cases. Experimental results from seven RNA families show execution times of 0.066-0.268 relative to using the full covariance model alone. Tests on the full sets of known sequences for each family show that at least 95 percent of these sequences are found for two families and 100 percent for five others. Since the full covariance model is run on all sequences accepted by the partial model decision tree, the false alarm rate is at least as low as that of the full model alone.
Keywords :
binary decision diagrams; bioinformatics; decision trees; macromolecules; molecular biophysics; organic compounds; search problems; RNA family multiple alignment column; RNA search; binary decision-tree framework; computational intelligence method; decision trees; false alarm rate; genomic sequence databases; partial covariance model; Bioinformatics; Bioinformatics (genome or protein) databases; Biology and genetics; RNA database search; Sciences; computational intelligence; covariance models; decision trees; Base Sequence; Computational Biology; Data Interpretation, Statistical; Databases, Nucleic Acid; Decision Trees; Models, Genetic; Nucleic Acid Conformation; RNA; Sequence Alignment;
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2008.120
Filename :
4668338