Author :
Yang, Bingru ; Wang, Lijun ; Zhai, Yun ; Qu, Wu
Author_Institution :
Sch. of Inf. Eng., Univ. of Sci. & Technol., Beijing, China
Abstract :
Protein secondary structure prediction is essential for tertiary structure modeling and is one of the major challenges in bioinformatics. In this article, we propose a gradually enhanced, multi-layered prediction system, the Compound Pyramid Model (CPM). The model consists of four independent layers coordinated through intelligent interfaces, and it combines several methods, such as SVM and the KDD* process model. Domain knowledge is applied throughout the model: the effective attributes are selected by Causal Cellular Automata, and a high-purity structure database is constructed for training. On the RS126 data set, the overall three-state per-residue accuracy, Q3, reached 83.99%. On the CB513 data set, Q3 reached 85.58%. On the CASP8 sequences, the results were superior to those produced by other methods, such as SAM, PSI-Blast, Prospect, and JUFO. These results indicate that the method has strong generalization ability.
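For readers unfamiliar with the Q3 measure reported above, the following is a minimal sketch of how a three-state per-residue accuracy is commonly computed: the percentage of residues whose predicted state matches the observed state. The function name `q3_accuracy` and the H/E/C (helix/strand/coil) string encoding are assumptions made for illustration; the abstract does not describe the authors' evaluation code or state encoding.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose predicted state matches.

    Both arguments are equal-length strings with one character per residue.
    The H/E/C alphabet is an assumed encoding, not taken from the paper.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count positions where the predicted state equals the observed state.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical 8-residue example: 7 of 8 states agree.
print(q3_accuracy("HHHEECCC", "HHHEEECC"))  # 87.5
```

In benchmark evaluations, Q3 is typically computed over all residues of a test set rather than averaged per sequence; which convention the authors used is not stated here.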
Keywords :
bioinformatics; cellular automata; proteins; support vector machines; KDD process model; SVM model; causal cellular automata; compound pyramid model; domain knowledge; protein secondary structure prediction; structure database; tertiary structure modeling; Accuracy; Amino acids; Compounds; Hidden Markov models; Predictive models; Data Mining; Hybrid Prediction Model