Comparing Different Recurrent Neural Architectures on a Specific Task from Vanishing Gradient Effect Perspective

Author

Squartini, Stefano ; Paolinelli, Stefano ; Piazza, Francesco

Author_Institution

A3Lab, Univ. Politecnica delle Marche, Ancona

fYear

2006

fDate

0-0 0

Firstpage

380

Lastpage

385

Abstract

This paper compares the performance of different recurrent neural systems on a specific task, the latching problem, which serves as a benchmark for evaluating the impact of the vanishing gradient effect that arises when a neural network is trained with gradient-based learning algorithms. Three distinct architectures are addressed, in different configurations: the fully recurrent neural network (fRNN), the recurrent multiscale network (RMN), and the echo state network (ESN). All are already known in the literature but have never been considered together from this perspective. As expected, ESNs seem to be immune to the vanishing gradient problem, whose effect is strong in the case of fRNNs and partially (but significantly) mitigated when RMNs are used.

Keywords

gradient methods; learning (artificial intelligence); recurrent neural nets; echo state network; fully recurrent neural network; gradient based learning algorithms; latching problem; neural network; recurrent multiscale network; recurrent neural architectures; vanishing gradient effect perspective; Computer architecture; Computer networks; Cost function; Digital signal processing; Employment; Helium; Information analysis; Learning systems; Neural networks; Recurrent neural networks

fLanguage

English

Publisher

IEEE

Conference_Titel

Networking, Sensing and Control, 2006. ICNSC '06. Proceedings of the 2006 IEEE International Conference on

Conference_Location

Ft. Lauderdale, FL

Print_ISBN

1-4244-0065-1

Type

conf

DOI

10.1109/ICNSC.2006.1673176

Filename

1673176