Title :
Generate, test, and explain: synthesizing regularity exposing attributes in large protein databases
Author :
De La Maza, Michael
Author_Institution :
Artificial Intelligence Lab., MIT, Cambridge, MA, USA
Abstract :
Describes a database mining system that synthesizes regularity-exposing attributes in large protein databases. After processing the primary and secondary structure data, this system discovers an amino acid representation that captures what are thought to be the three most important amino acid characteristics (size, charge, and hydrophobicity) for tertiary structure prediction. A neural network trained using this 16-bit representation achieves an accuracy on the secondary structure prediction problem comparable to that of a neural network trained using the standard 24-bit amino acid representation.
Keywords :
biology computing; explanation; macromolecular configurations; neural nets; proteins; very large databases; 16-bit representation; amino acid representation; charge; database mining system; hydrophobicity; large protein databases; neural network training; performance accuracy; primary structure data processing; regularity-exposing attribute synthesis; secondary structure prediction; size; tertiary structure prediction
Conference_Title :
System Sciences, 1994. Proceedings of the Twenty-Seventh Hawaii International Conference on
Conference_Location :
Wailea, HI, USA
Print_ISBN :
0-8186-5090-7
DOI :
10.1109/HICSS.1994.323559