Title :
Protein Attributes Microtuning System (PAMS): an effective tool to increase protein structure prediction by data purification
Author :
Zhang, Fan ; Povey, David ; Krause, Paul
Author_Institution :
Surrey Univ., Guildford
Abstract :
Given the expense of more direct determinations, using machine-learning schemes to predict protein secondary structure from the sequence alone remains an important methodology. To achieve significant improvements in prediction accuracy, the authors have developed an automated tool to prepare very large biological datasets for use by the learning network. By focusing on improvements in data quality and validation, our experiments yielded a highest protein secondary structure prediction accuracy of 90.97%. An important additional aspect of this achievement is that the predictions are based on a template-free statistical modeling mechanism. The performance of each classifier is also evaluated and discussed. A set of 232 protein chains is proposed for use in the prediction. Our goal is to make the tools discussed available as services forming part of a digital ecosystem that supports knowledge sharing amongst the protein structure prediction community.
Keywords :
biology computing; data mining; learning (artificial intelligence); pattern classification; proteins; sequences; statistical analysis; very large databases; data purification; instance-based classifier; knowledge sharing; machine-learning scheme; protein attributes microtuning system; protein secondary structure prediction; protein sequence; template-free statistical modeling mechanism; very large biological datasets; Accuracy; Alzheimer's disease; Chemicals; Chemistry; Ecosystems; Pharmaceuticals; Predictive models; Protein engineering; Purification; Shape; Automata; Biomedical Computing; Data Management; Prediction Methods; Proteins;
Conference_Title :
Digital EcoSystems and Technologies Conference, 2007. DEST '07. Inaugural IEEE-IES
Conference_Location :
Cairns
Print_ISBN :
1-4244-0470-3
Electronic_ISBN :
1-4244-0470-3
DOI :
10.1109/DEST.2007.372039