DocumentCode :
445804
Title :
Optimal gradient-based learning using importance weights
Author :
Hochreiter, Sepp ; Obermayer, Klaus
Author_Institution :
Bernstein Center for Computational Neuroscience, Berlin, Germany
Volume :
1
fYear :
2005
fDate :
31 July-4 Aug. 2005
Firstpage :
114
Abstract :
We introduce a novel "importance weight" method (IW) to speed up learning on "difficult" data sets, including unbalanced data, highly non-linear data, and long-term dependencies in sequences. An importance weight is assigned to every training data point and controls its contribution to the total weight update. The importance weights are obtained by solving a quadratic optimization problem and determine how informative each data point is for learning. For linear classifiers we show that IW is equivalent to standard support vector learning. We apply IW to feedforward multi-layer perceptrons and to recurrent neural networks (LSTM). Benchmarks against QuickProp and standard gradient descent methods show that IW is usually much faster, both in epochs and in absolute CPU time, and that it provides equal or better prediction results. IW improved gradient descent results on "real world" protein datasets. In the "latching benchmark" for sequence prediction, IW was able to extract dependencies between sites that are 1,000,000 sequence elements apart - a new record.
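Note: the abstract describes scaling each training point's contribution to the total weight update by an importance weight. The following Python sketch illustrates only that weighted-update idea for a linear model with squared error; the function name, the toy data, and the uniform placeholder weights are illustrative assumptions, and the quadratic program that produces the weights in the paper is not reproduced here.

import numpy as np

def weighted_gradient_step(W, X, y, alpha, lr=0.1):
    """One gradient-descent step for a linear model with squared error,
    where alpha[i] scales the contribution of training point i
    (a sketch of the weighted update, not the authors' implementation)."""
    preds = X @ W                       # linear predictions
    errors = preds - y                  # per-example residuals
    # Per-example gradients of 0.5*(pred - y)^2 w.r.t. W, scaled by alpha.
    grad = X.T @ (alpha * errors) / len(y)
    return W - lr * grad

# Toy usage with uniform importance weights; in the paper the weights come
# from a quadratic optimization problem that emphasizes informative points.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=100)
W = np.zeros(5)
alpha = np.ones(100)                    # placeholder importance weights
for _ in range(50):
    W = weighted_gradient_step(W, X, y, alpha)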
Keywords :
feedforward neural nets; gradient methods; learning (artificial intelligence); multilayer perceptrons; optimisation; pattern classification; QuickProp; feedforward multilayer perceptrons; gradient descent methods; importance weight method; linear classifiers; optimal gradient-based learning; quadratic optimization problem; recurrent neural networks; support vector learning; total weight update; training data point; Error correction; Multilayer perceptrons; Newton method; Optimization methods; Proteins; Recurrent neural networks; Supervised learning; Text categorization; Training data; Vectors;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Proceedings of the 2005 IEEE International Joint Conference on Neural Networks (IJCNN '05)
Print_ISBN :
0-7803-9048-2
Type :
conf
DOI :
10.1109/IJCNN.2005.1555815
Filename :
1555815