Title :
Malphite: A convolutional neural network and ensemble learning based protein secondary structure predictor
Author :
Yang Li;Tetsuo Shibuya
Author_Institution :
Department of Computer Science, Graduate School of Information Science and Technology, University of Tokyo, Japan
Abstract :
We developed a convolutional neural network (CNN) and ensemble learning based method, called Malphite, to predict protein secondary structures. Malphite has three sub-models: the 1st CNN, PSI-PRED and the 2nd CNN. The 1st CNN and PSI-PRED predict the initial secondary structure from the position-specific scoring matrix generated by PSI-BLAST. The 2nd CNN performs ensemble learning by combining the predictions of the 1st CNN and PSI-PRED to generate the final predictions. Malphite achieved Q3 scores of 82.3% and 82.6% on independently built datasets of 400 and 538 proteins, respectively, and 82.6% ten-fold cross-validated accuracy on a dataset of 3000 proteins. In addition, Malphite achieved a Q3 score of 83.6% on 122 targets from CASP10 (Critical Assessment of protein Structure Prediction), surpassing any secondary structure prediction technique to date. On all four datasets, Malphite consistently makes 2% more accurate predictions than PSI-PRED, which is a significant step towards the estimated upper limit of 90% for protein secondary structure prediction accuracy.
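The Q3 score reported above is the fraction of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The following is a minimal sketch of that calculation, not the authors' evaluation code; the function name `q3_score` and the example strings are hypothetical and chosen only for illustration.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Return the fraction of positions where the predicted three-state
    label (H, E or C) equals the observed label.

    Hypothetical helper for illustration; not taken from Malphite.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count per-residue agreements between the two label strings.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Made-up 10-residue example with a single mismatch at the helix/strand boundary.
predicted = "CCHHHHEEEC"
observed = "CCHHHHHEEC"
print(f"Q3 = {q3_score(predicted, observed):.1%}")
```

For a dataset-level figure, Q3 is commonly computed over all residues pooled across proteins; whether the scores in the abstract were pooled or averaged per protein is not stated here.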
Keywords :
"Proteins","Genomics","Bioinformatics"
Conference_Title :
Bioinformatics and Biomedicine (BIBM), 2015 IEEE International Conference on
DOI :
10.1109/BIBM.2015.7359861