Title :
Protein secondary structure prediction with ICA feature extraction
Author :
Melo, J.C.B. ; Cavalcanti, George D. C. ; Guimarães, Katia S.
Author_Institution :
Center of Informatics, Federal Univ. of Pernambuco, Recife, Brazil
Abstract :
An original application of independent component analysis (ICA) is presented in this work. This linear transformation method is used for feature extraction in a machine learning approach to the protein secondary structure prediction problem. PSI-BLAST profiles, built on NCBI's nonredundant protein database, have their dimensionality reduced through ICA. The resulting components are used as input data to three artificial neural networks with 30, 35, or 40 nodes in the hidden layer. These classifiers are trained with the RPROP algorithm, and five rules are used to combine their outputs. The results are compared with the best ones recently obtained under similar conditions, including experiments using principal component analysis (PCA) as the feature extraction method, and the ICA-based approach presents the best result. Individually, the networks achieve a Q3 accuracy of 74.1% on average, using only 120 independent components. When the networks are combined with the product rule, the accuracy reaches 75.2%. This result is surpassed only when the raw data are fed directly to the networks, in which case an accuracy of 75.9% is achieved.
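The product rule and the Q3 measure mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `product_rule` and `q3_accuracy`, the class encoding (0 = helix, 1 = strand, 2 = coil), and the toy network outputs are all assumptions made for illustration. The sketch assumes each network emits one score per class for every residue; the combined score of a class is the product of the scores given by all networks, and Q3 is the fraction of residues whose three-state label is predicted correctly.

```python
from math import prod


def product_rule(outputs):
    """Combine classifier outputs with the product rule.

    outputs[k][r][c] is the score that classifier k assigns to class c
    for residue r. Returns one predicted class index per residue.
    """
    n_residues = len(outputs[0])
    n_classes = len(outputs[0][0])
    predictions = []
    for r in range(n_residues):
        # Multiply the scores of all classifiers for each class.
        scores = [prod(net[r][c] for net in outputs) for c in range(n_classes)]
        # Pick the class with the largest combined score.
        predictions.append(scores.index(max(scores)))
    return predictions


def q3_accuracy(predicted, actual):
    """Fraction of residues whose three-state label is predicted correctly."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical outputs of three networks for four residues
# (assumed encoding: 0 = helix, 1 = strand, 2 = coil).
outputs = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.3, 0.3, 0.4], [0.5, 0.4, 0.1]],
    [[0.5, 0.4, 0.1], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6], [0.3, 0.6, 0.1]],
    [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.3, 0.4, 0.3], [0.4, 0.5, 0.1]],
]
actual = [0, 1, 2, 0]

predicted = product_rule(outputs)
print(predicted)                       # [0, 1, 2, 1]
print(q3_accuracy(predicted, actual))  # 0.75
```

In this toy case the first network alone would label the last residue correctly, but the other two outvote it under the product, which shows that a combination rule does not improve every individual prediction even when it improves accuracy on average.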
Keywords :
biology computing; feature extraction; independent component analysis; learning (artificial intelligence); molecular biophysics; neural nets; principal component analysis; proteins; artificial neural networks; machine learning; protein database; protein secondary structure prediction; In vivo; Informatics; Neural networks; Physics; Sequences
Conference_Title :
Neural Networks for Signal Processing, 2003. NNSP'03. 2003 IEEE 13th Workshop on
Print_ISBN :
0-7803-8177-7
DOI :
10.1109/NNSP.2003.1318000