DocumentCode :
2773874
Title :
Nonsmooth Bilevel Programming for Hyperparameter Selection
Author :
Moore, Gregory M. ; Bergeron, Charles ; Bennett, Kristin P.
Author_Institution :
Dept. of Math. Sci., Rensselaer Polytech. Inst., Troy, NY, USA
fYear :
2009
fDate :
6-6 Dec. 2009
Firstpage :
374
Lastpage :
381
Abstract :
We propose a nonsmooth bilevel programming method for training linear learning models with hyperparameters optimized via T-fold cross-validation (CV). This algorithm scales well in the sample size. The method handles loss functions with embedded maxima such as in support vector machines. Current practice constructs models over a predefined grid of hyperparameter combinations and selects the best one, an inefficient heuristic. Innovating over previous bilevel CV approaches, this paper represents an advance towards the goal of self-tuning supervised data mining as well as a significant innovation in scalable bilevel programming algorithms. Using the bilevel CV formulation, the lower-level problems are treated as unconstrained optimization problems and are replaced with their optimality conditions. The resulting nonlinear program is nonsmooth and nonconvex. We develop a novel bilevel programming algorithm to solve this class of problems, and apply it to linear least-squares support vector regression having hyperparameters C (tradeoff) and ε (loss insensitivity). This new approach outperforms grid search and prior smooth bilevel CV methods in terms of modeling performance. Increased speed foresees modeling with an increased number of hyperparameters.
Keywords :
data mining; learning (artificial intelligence); optimisation; regression analysis; support vector machines; T-fold cross-validation; hyperparameter combinations grid; linear learning training models; linear least-squares support vector regression; loss functions; nonlinear program; nonsmooth bilevel programming; self-tuning supervised data mining; unconstrained optimization problems; Cloud computing; Clustering algorithms; Computer networks; Conferences; Costs; Data mining; Data processing; Decision trees; Machine learning algorithms; Training data
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining Workshops, 2009. ICDMW '09. IEEE International Conference on
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4244-5384-9
Electronic_ISBN :
978-0-7695-3902-7
Type :
conf
DOI :
10.1109/ICDMW.2009.74
Filename :
5360434