Title :
A Framework for Multiple Kernel Support Vector Regression and Its Applications to siRNA Efficacy Prediction
Author :
Qiu, Shibin ; Lane, Terran
Author_Institution :
Pathwork Diagnostics, Inc., Sunnyvale, CA
Abstract :
The cell defense mechanism of RNA interference has applications in gene function analysis and promising potential in human disease therapy. To silence a target gene effectively, it is desirable to select initiator siRNA molecules with satisfactory silencing capabilities. Computational prediction of the silencing efficacy of siRNAs can assist this screening process before they are used in biological experiments. String kernel functions, which operate directly on the string objects representing siRNAs and target mRNAs, have been applied to support vector regression for this prediction and have improved accuracy over numerical kernels in multidimensional vector spaces constructed from descriptors of siRNA design rules. To fully utilize the information provided by string and numerical data, we propose to unify the two in a kernel feature space by devising a multiple kernel regression framework in which a linear combination of the kernels is used. We formulate multiple kernel learning as a quadratically constrained quadratic programming (QCQP) problem, which, although it yields a globally optimal solution, is computationally demanding and requires a commercial solver package. We further propose three heuristics based on the principles of kernel-target alignment and predictive accuracy. Empirical results demonstrate that multiple kernel regression can improve accuracy, decrease model complexity by reducing the number of support vectors, and speed up computation dramatically. In addition, multiple kernel regression evaluates the importance of the constituent kernels, which, for the siRNA efficacy prediction problem, compares the relative significance of the design rules. Finally, we give insights into the multiple kernel regression mechanism and point out possible extensions.
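The idea of weighting a string kernel and a numerical kernel by kernel-target alignment can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the k-mer spectrum kernel stands in for the string kernels, the RBF kernel stands in for the numerical kernels, the weighting rule (weights proportional to alignment) is one plausible alignment-based heuristic and not necessarily any of the three heuristics proposed in the paper, and the sequences, descriptors, and efficacy values are made-up toy data.

```python
import math
from collections import Counter

def spectrum_kernel(a, b, k=3):
    """Illustrative string kernel: inner product of k-mer count vectors."""
    ca = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    cb = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    return float(sum(ca[s] * cb[s] for s in ca))

def rbf_kernel(u, v, gamma=0.5):
    """Illustrative numerical kernel on descriptor vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(u, v))
    return math.exp(-gamma * d2)

def gram(items, kernel):
    """Gram matrix of a kernel over a list of items."""
    return [[kernel(a, b) for b in items] for a in items]

def frobenius(A, B):
    """Frobenius inner product of two equally sized matrices."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def alignment(K, y):
    """Kernel-target alignment between K and the target matrix y y^T."""
    T = [[yi * yj for yj in y] for yi in y]
    denom = math.sqrt(frobenius(K, K) * frobenius(T, T))
    return frobenius(K, T) / denom if denom > 0 else 0.0

def combine(kernels, y):
    """Weight each Gram matrix by its (non-negative) alignment, normalized to sum to 1."""
    scores = [max(alignment(K, y), 0.0) for K in kernels]
    total = sum(scores)
    if total == 0:
        weights = [1.0 / len(kernels)] * len(kernels)  # fall back to uniform weights
    else:
        weights = [s / total for s in scores]
    n = len(y)
    combined = [[sum(w * K[i][j] for w, K in zip(weights, kernels))
                 for j in range(n)] for i in range(n)]
    return weights, combined

# Toy data (hypothetical, for illustration only).
seqs = ["AUGGCUAAGC", "AUGGCUUAGC", "CCGAUAGGCA", "GGCAUUCGAU"]
desc = [[0.4, 1.0], [0.5, 1.0], [0.6, 0.0], [0.5, 0.0]]
y = [0.9, 0.8, 0.2, 0.3]

kernels = [gram(seqs, spectrum_kernel), gram(desc, rbf_kernel)]
weights, K = combine(kernels, y)
print("kernel weights:", weights)
```

The combined Gram matrix `K` could then be handed to any support vector regression implementation that accepts a precomputed kernel. The resulting weights also give a rough indication of how much each constituent kernel contributes, which mirrors the abstract's point about comparing the relative significance of design rules.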
Keywords :
bioinformatics; macromolecules; molecular biophysics; organic compounds; support vector machines; RNA interference; cell defense mechanism; gene function analysis; human disease therapy; kernel-target alignment; multiple kernel learning; multiple kernel support vector regression; quadratically constrained quadratic programming; siRNA efficacy prediction; siRNA molecules; silencing efficacy; QCQP optimization; multiple kernel heuristics; siRNA efficacy; support vector regression; Algorithms; Animals; Artificial Intelligence; Humans; Models, Genetic; Models, Statistical; RNA, Small Interfering; Regression Analysis; Sequence Alignment
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2008.139