Title :
Mining a set of coregulated RNA sequences
Author_Institution :
Dept. of Comput. & Inf. Sci., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Abstract :
Post-transcriptional regulation, though less studied, is an important research topic in bioinformatics. In a set of post-transcriptionally coregulated RNAs, the basepair interactions can organize the molecules into domains and provide a framework for functional interactions. Their consensus motifs may represent the binding sites of RNA regulatory proteins. Unlike DNA motifs, RNA motifs are more conserved in structure than in sequence. Knowing the structural motifs can help us better understand these regulatory activities. We propose a novel data mining approach to RNA secondary structure prediction. First, to demonstrate the performance of our approach, we tested it on data sets previously used in the published literature. Second, to show its flexibility, we tested it on a data set containing pseudoknot motifs, which most current systems cannot identify.
Keywords :
biology computing; data mining; learning (artificial intelligence); scientific information systems; very large databases; RNA motifs; RNA regulatory proteins; RNA secondary structure prediction; bioinformatics; coregulated RNA sequence mining; data sets; functional interactions; post-transcriptional regulation; pseudoknot motifs; supervised learning; DNA; Dynamic programming; Genetic algorithms; Information science; Prediction methods; Proteins; RNA; Sequences; Stochastic processes; Testing
Conference_Title :
Data Mining, 2002. ICDM 2002. Proceedings. 2002 IEEE International Conference on
Print_ISBN :
0-7695-1754-4
DOI :
10.1109/ICDM.2002.1184014