Title :
Protein secondary structure pattern discovery and its application in secondary structure prediction
Author :
Li, Ming-Hui ; Wang, Xiao-long ; Lin, Lei ; Guan, Yi
Author_Institution :
Dept. of Comput. Sci. & Technol., Harbin Inst. of Technol., China
Abstract :
A method for protein secondary structure pattern discovery is presented. An improved version of the TEIRESIAS algorithm is used to discover protein secondary structure patterns. Four protein secondary structure pattern dictionaries have been built for four organisms. The distribution of patterns, and the structure of the patterns common to several dictionaries, differ from one dictionary to another. The proteins of different organisms represent different biological languages. Based on the organism-specific dictionary, a hidden Markov model is built to predict protein secondary structure. Dictionary-based prediction has been tested on four organisms and compared with the profile network from HeiDelberg (PHD) method. The experimental results show that our prediction method performs better than the PHD method on the modified segment overlap (SOV) assessment.
Keywords :
biology computing; data mining; hidden Markov models; microorganisms; pattern recognition; proteins; TEIRESIAS algorithm; biological language; dictionary based prediction; hidden Markov model; organism specific dictionary; pattern distribution; profile network from HeiDelberg method; protein secondary structure pattern dictionary; protein secondary structure pattern discovery; protein secondary structure prediction; segment overlap assessment; Amino acids; Application software; Computer science; Databases; Dictionaries; Hidden Markov models; Organisms; Proteins; Sequences; Solids;
Conference_Title :
Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
Print_ISBN :
0-7803-8403-2
DOI :
10.1109/ICMLC.2004.1381999