DocumentCode :
1050437
Title :
Evolutionary Optimization of Kernel Weights Improves Protein Complex Comembership Prediction
Author :
Hulsman, Marc ; Reinders, Marcel J T ; De Ridder, Dick
Author_Institution :
Information and Communication Theory Group, Delft University of Technology, Delft, Netherlands
Volume :
6
Issue :
3
fYear :
2009
Firstpage :
427
Lastpage :
437
Abstract :
In recent years, more and more high-throughput data sources useful for protein complex prediction have become available (e.g., gene sequence, mRNA expression, and interactions). The integration of these different data sources can be challenging. Recently, it has been recognized that kernel-based classifiers are well suited for this task. However, the different kernels (data sources) are often combined using equal weights. Although several methods have been developed to optimize kernel weights, no large-scale example of an improvement in classifier performance has been shown yet. In this work, we employ an evolutionary algorithm to determine weights for a larger set of kernels by optimizing a criterion based on the area under the ROC curve. We show that setting the right kernel weights can indeed improve performance. We compare this to the existing kernel weight optimization methods (i.e., (regularized) optimization of the SVM criterion or aligning the kernel with an ideal kernel) and find that these do not result in a significant performance improvement and can even cause a decrease in performance. Results also show that an expert approach of assigning high weights to features with high individual performance is not necessarily the best strategy.
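As a rough illustration of the idea in the abstract (evolving kernel weights to maximize an AUC-based criterion), here is a minimal sketch. It is not the authors' implementation: the function names (`combine`, `auc`, `fitness`, `evolve_weights`) are hypothetical, a simple kernel mean-similarity scorer stands in for the SVM used in the paper, and a basic greedy mutation loop stands in for their evolutionary algorithm.

```python
import random

def combine(kernels, weights):
    """Weighted sum of kernel matrices, each given as a list of lists."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

def auc(scores, labels):
    """Area under the ROC curve: fraction of (positive, negative) pairs
    ranked correctly, with ties counted as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def fitness(weights, kernels, labels, train, valid):
    """Validation AUC of a mean-similarity scorer on the combined kernel.
    Assumption: this scorer is a stand-in for a kernel classifier such as an SVM."""
    K = combine(kernels, weights)
    pos = [i for i in train if labels[i] == 1]
    neg = [i for i in train if labels[i] == 0]
    scores = [sum(K[v][i] for i in pos) / len(pos)
              - sum(K[v][i] for i in neg) / len(neg) for v in valid]
    return auc(scores, [labels[v] for v in valid])

def evolve_weights(kernels, labels, train, valid,
                   generations=30, offspring=8, sigma=0.3, seed=0):
    """Greedy mutation search over non-negative weights that sum to 1."""
    rng = random.Random(seed)
    best = [1.0 / len(kernels)] * len(kernels)  # start from equal weights
    best_fit = fitness(best, kernels, labels, train, valid)
    for _ in range(generations):
        for _ in range(offspring):
            # Gaussian mutation, clipped at zero, then renormalized.
            child = [max(0.0, w + rng.gauss(0.0, sigma)) for w in best]
            total = sum(child)
            if total == 0.0:
                continue  # degenerate child, skip it
            child = [w / total for w in child]
            fit = fitness(child, kernels, labels, train, valid)
            if fit > best_fit:  # keep only strict improvements
                best, best_fit = child, fit
    return best, best_fit
```

Because the search starts from equal weights and only accepts strict improvements, the returned validation AUC is never below the equal-weight baseline on the same split; whether that gain carries over to held-out data is a separate question that would need its own test set.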
Keywords :
biology computing; evolutionary computation; optimisation; pattern classification; proteomics; area under ROC curve; classifier performance improvement; data source integration; evolutionary algorithm; high throughput data sources; kernel based classifiers; kernel weight evolutionary optimization; protein complex comembership prediction; Bioinformatics; Biology computing; Evolutionary computation; Kernel; Large-scale systems; Optimization methods; Protein engineering; Sequences; Support vector machine classification; Support vector machines; Classifier design and evaluation; biology and genetics; evolutionary computing and genetic algorithms; Algorithms; Artificial Intelligence; Evolution, Molecular; Linear Models; Models, Genetic; Multiprotein Complexes; Nonlinear Dynamics; ROC Curve; Reproducibility of Results
fLanguage :
English
Journal_Title :
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2008.137
Filename :
4731237