Title :
Continuous optimization of hyper-parameters
Author_Institution :
Dept. d'Inf. et Recherche Oper., Montreal Univ., Que., Canada
Abstract :
Many machine learning algorithms can be formulated as the minimization of a training criterion which involves a hyper-parameter. This hyper-parameter is usually chosen by trial and error with a model selection criterion. In this paper we present a methodology to optimize several hyper-parameters, based on the computation of the gradient of a model selection criterion with respect to the hyper-parameters. In the case of a quadratic training criterion, the gradient of the selection criterion with respect to the hyper-parameters is efficiently computed by back-propagating through a Cholesky decomposition. In the more general case, we show that the implicit function theorem can be used to derive a formula for the hyper-parameter gradient involving second derivatives of the training criterion.
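For illustration, here is a minimal JAX sketch of the quadratic case described in the abstract. It is not the authors' implementation: the ridge-regression setup and the names train_weights and selection_criterion are assumptions made for the example. Training weights are obtained through a Cholesky solve of the regularized normal equations, a validation error serves as the model selection criterion, and automatic differentiation back-propagates through the Cholesky decomposition to give the gradient with respect to the regularization hyper-parameter.

```python
# Minimal sketch (not the paper's code): hyper-parameter gradient for ridge
# regression, where the quadratic training criterion is minimized via a
# Cholesky solve and a validation MSE plays the role of the model selection
# criterion. JAX back-propagates through the Cholesky decomposition.
import jax
import jax.numpy as jnp

def train_weights(log_lam, X_train, y_train):
    """Minimize ||Xw - y||^2 + lam * ||w||^2 by solving the normal equations
    with a Cholesky factorization (hypothetical helper for this example)."""
    lam = jnp.exp(log_lam)                      # keep the hyper-parameter positive
    d = X_train.shape[1]
    A = X_train.T @ X_train + lam * jnp.eye(d)  # regularized Gram matrix
    b = X_train.T @ y_train
    L = jnp.linalg.cholesky(A)                  # A = L L^T
    # Two triangular solves give w = A^{-1} b.
    z = jax.scipy.linalg.solve_triangular(L, b, lower=True)
    return jax.scipy.linalg.solve_triangular(L.T, z, lower=False)

def selection_criterion(log_lam, X_train, y_train, X_val, y_val):
    """Validation mean squared error, used as the model selection criterion."""
    w = train_weights(log_lam, X_train, y_train)
    return jnp.mean((X_val @ w - y_val) ** 2)

# Gradient of the selection criterion w.r.t. the hyper-parameter: automatic
# differentiation back-propagates through the Cholesky solve.
hyper_grad = jax.grad(selection_criterion, argnums=0)

if __name__ == "__main__":
    key = jax.random.PRNGKey(0)
    k1, k2, k3 = jax.random.split(key, 3)
    X_tr = jax.random.normal(k1, (50, 5))
    w_true = jnp.arange(1.0, 6.0)
    y_tr = X_tr @ w_true + 0.1 * jax.random.normal(k2, (50,))
    X_va = jax.random.normal(k3, (30, 5))
    y_va = X_va @ w_true

    log_lam = jnp.array(0.0)
    for _ in range(100):  # simple gradient descent on the hyper-parameter itself
        log_lam = log_lam - 0.1 * hyper_grad(log_lam, X_tr, y_tr, X_va, y_va)
    print("optimized lambda:", float(jnp.exp(log_lam)))
```

In the more general case treated in the paper, the same hyper-parameter gradient is derived via the implicit function theorem and involves second derivatives of the training criterion; the sketch above only covers the quadratic case, where differentiating through the Cholesky solve suffices.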
Keywords :
gradient methods; learning (artificial intelligence); minimisation; neural nets; Cholesky decomposition; continuous optimization; hyper-parameter optimization; implicit function theorem; machine learning algorithms; model selection criterion gradient; quadratic training criterion minimization; Bayesian methods; Linear regression; Machine learning; Machine learning algorithms; Minimization methods; Optimization methods; Supervised learning;
Conference_Title :
Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000)
Conference_Location :
Como, Italy
Print_ISBN :
0-7695-0619-4
DOI :
10.1109/IJCNN.2000.857853