DocumentCode :
2379006
Title :
Simple but effective methods for combining kernels in computational biology
Author :
Tanabe, Hiroaki ; Ho, Tu Bao ; Nguyen, Canh Hao ; Kawasaki, Saori
Author_Institution :
Sch. of Knowledge Sci., Japan Adv. Inst. of Sci. & Technol., Ishikawa
fYear :
2008
fDate :
13-17 July 2008
Firstpage :
71
Lastpage :
78
Abstract :
Complex biological data generated from various experiments are stored as diverse data types across multiple datasets. By representing each biological dataset as a kernel matrix and then combining the matrices to solve a problem, the kernel-based approach has attracted considerable attention in data integration, with applications in bioinformatics and other fields. While the linear combination of unweighted multiple kernels (UMK) is popular, there have been efforts on multiple kernel learning (MKL), where optimal weights are learned by semi-definite programming or sequential minimal optimization (SMO-MKL). These methods achieve high accuracy on biological prediction problems, but they are complicated and hard to use, especially for non-experts in optimization. They also usually have a high computational cost and are not suitable for large data sets. In this paper, we propose two simple but effective methods for determining the weights of a conic combination of multiple kernels. The first, denoted FSM-MKL, learns optimal weights formulated by our measure FSM for kernel matrix evaluation (feature space-based kernel matrix evaluation measure). The second, named proportionally weighted multiple kernels (PWMK), assigns each kernel a weight proportional to the quality of the kernel, as determined by direct cross-validation. An experimental comparison of the four methods UMK, SMO-MKL, FSM-MKL and PWMK on the problem of protein-protein interactions shows that our proposed methods are simpler and more efficient while remaining effective. They achieved performance almost as high as that of MKL and higher than that of UMK.
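The conic combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `proportional_weights` and `combine_kernels`, the toy linear kernels, and the quality scores 0.9 and 0.6 are all hypothetical, and normalizing the weights to sum to one is an assumption (the abstract only says the weights are proportional to kernel quality).

```python
import numpy as np

def proportional_weights(scores):
    """Turn per-kernel quality scores (e.g. cross-validation accuracies)
    into nonnegative weights proportional to those scores.
    Normalizing to sum to 1 is an assumption made for this sketch."""
    s = np.asarray(scores, dtype=float)
    if np.any(s < 0) or s.sum() <= 0:
        raise ValueError("scores must be nonnegative with a positive sum")
    return s / s.sum()

def combine_kernels(kernels, weights):
    """Conic combination: sum_i w_i * K_i with every w_i >= 0.
    Nonnegative weights keep the result positive semi-definite
    whenever each K_i is positive semi-definite."""
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0):
        raise ValueError("conic combination requires nonnegative weights")
    return sum(wi * K for wi, K in zip(w, kernels))

# Two toy linear kernels over the same 6 samples (hypothetical data).
rng = np.random.default_rng(0)
X1 = rng.normal(size=(6, 3))
X2 = rng.normal(size=(6, 4))
K1 = X1 @ X1.T
K2 = X2 @ X2.T

# UMK-style baseline: equal weights for every kernel.
umk = combine_kernels([K1, K2], [0.5, 0.5])

# PWMK-style weighting: hypothetical cross-validation scores per kernel.
w = proportional_weights([0.9, 0.6])
pwmk = combine_kernels([K1, K2], w)
```

In practice the scores would come from cross-validating a classifier on each kernel separately, and the combined matrix would then be passed to a kernel method that accepts a precomputed kernel.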
Keywords :
biology computing; data analysis; finite state machines; proteins; FSM; bioinformatics; complex biological data; computational biology; data integration; kernel matrix evaluation; protein-protein interactions; unweighed multiple kernels; Bioinformatics; Biology computing; Computational biology; Fungi; Gene expression; Kernel; Proteins; Pulse width modulation; Support vector machine classification; Support vector machines;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Research, Innovation and Vision for the Future, 2008. RIVF 2008. IEEE International Conference on
Conference_Location :
Ho Chi Minh City
Print_ISBN :
978-1-4244-2379-8
Electronic_ISBN :
978-1-4244-2380-4
Type :
conf
DOI :
10.1109/RIVF.2008.4586335
Filename :
4586335