Title :
The problem of learning long-term dependencies in recurrent networks
Author :
Bengio, Yoshua ; Frasconi, Paolo ; Simard, Patrice
Author_Institution :
AT&T Bell Lab., Murray Hill, NJ, USA
Abstract :
The authors seek to train recurrent neural networks to map input sequences to output sequences, for applications in sequence recognition or production. Results are presented showing that learning long-term dependencies in such recurrent networks using gradient descent is a very difficult task. It is shown how this difficulty arises when bits of information are robustly latched with certain attractors: the derivatives of the output at time t with respect to the unit activations at time zero tend rapidly to zero as t increases, for most input values. In such a situation, simple gradient descent techniques appear inappropriate, and the consideration of alternative optimization methods and architectures is suggested.
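The decay of derivatives described in the abstract can be illustrated numerically. The sketch below (an assumption for illustration, not code from the paper) builds a small tanh recurrent network whose recurrent weight matrix has spectral norm below 1, accumulates the Jacobian of the hidden state at time t with respect to the state at time zero as a product of per-step Jacobians, and prints how its norm shrinks as t grows:

```python
import numpy as np

# Hypothetical illustration of the vanishing-derivative effect: in a simple
# tanh RNN, dh_t/dh_0 is a product of per-step Jacobians, each of norm < 1
# here, so the accumulated norm tends rapidly to zero as t increases.
rng = np.random.default_rng(0)
n = 8                                    # number of hidden units (arbitrary)
W = rng.normal(size=(n, n))
W *= 0.9 / np.linalg.norm(W, 2)          # scale spectral norm of W to 0.9
h = rng.normal(size=n)                   # initial hidden state h_0
J = np.eye(n)                            # accumulated Jacobian dh_t/dh_0

norms = []
for t in range(50):
    h = np.tanh(W @ h)
    # per-step Jacobian of h_{t+1} = tanh(W h_t):  diag(1 - h^2) @ W
    J = np.diag(1.0 - h**2) @ W @ J
    norms.append(np.linalg.norm(J, 2))

print(f"||dh_1/dh_0||  = {norms[0]:.3e}")
print(f"||dh_50/dh_0|| = {norms[-1]:.3e}")
```

Since each per-step Jacobian has norm at most 0.9 in this setup, the accumulated norm is bounded by 0.9^t, matching the abstract's claim that the derivatives vanish rapidly with t.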
Keywords :
learning (artificial intelligence); recurrent neural nets; gradient descent; input sequences; long-term dependencies; neural networks; optimization methods; output sequences; recurrent networks; sequence recognition; unit activations; Background noise; Discrete transforms; Intelligent networks; Neural networks; Optimization methods; Production; Recurrent neural networks; Robustness; Speech; Text recognition;
Conference_Title :
1993 IEEE International Conference on Neural Networks
Conference_Location :
San Francisco, CA
Print_ISBN :
0-7803-0999-5
DOI :
10.1109/ICNN.1993.298725