Title :
Using logic for protein structure prediction
Author :
Muggleton, S. ; King, Ross D. ; Sternberg, Michael J E
Author_Institution :
Turing Inst., Glasgow, UK
Abstract :
The prediction of protein secondary structure from a primary sequence is one of the most important unsolved problems in molecular biology. This paper shows that the use of a machine learning algorithm (Golem) which allows relational descriptions leads to improved performance. Golem takes, as input, examples and background knowledge described as Prolog facts. It produces, as output, Prolog rules which are a generalisation of the examples. Golem was applied to learning secondary structure prediction rules for alpha domain type proteins (a subset of the Protein Data Bank rich in helical secondary structure and nearly devoid of beta sheet). Golem learned a small set of rules predicting which residues are part of α-helices based on their positional relationships and chemical and physical properties. This representation is more easily understood by molecular biologists. Performance of the learned rules was 81% (+/-2%).
Keywords :
biology computing; learning systems; logic programming; macromolecular configurations; molecular biophysics; physics computing; proteins; Golem; Prolog facts; Prolog rules; Protein Data Bank; alpha-helices; alpha domain type proteins; background knowledge; chemical properties; examples; helical secondary structure; machine learning algorithm; molecular biology; performance; physical properties; positional relationships; primary sequence; protein structure prediction; relational descriptions; residues; Artificial intelligence; Biological information theory; Chemicals; Learning systems; Logic; Machine learning algorithms; Proteins; Sequences; Shape; Statistics
Conference_Title :
System Sciences, 1992. Proceedings of the Twenty-Fifth Hawaii International Conference on
Conference_Location :
Kauai, HI
Print_ISBN :
0-8186-2420-5
DOI :
10.1109/HICSS.1992.183221