Learning the architecture of neural networks for speech recognition

Author

Bodenhausen, Ulrich ; Waibel, Alex

Author_Institution

Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA

fYear

1991

fDate

14-17 Apr 1991

Firstpage

117

Abstract

Results are presented that suggest that it is possible to learn the architecture of neural networks for speech recognition systems. The Tempo 2 algorithm is proposed. It is a training algorithm for neural networks that trains the temporal parameters of the network (the delays and widths of the input windows) as well as the weights. A comparison of performance with only one adaptive parameter set (weights, delays, or widths) shows that the weights are the main parameters. Delays and widths seem to be of lesser importance, but in combination with the weights the temporal parameters can improve performance, especially generalization. A Tempo 2 network with trained delays and widths and random weights can classify 70% of the phonemes correctly. The application to phoneme classification shows that this adaptive architecture can approach the performance of a carefully hand-tuned TDNN (time-delay neural network) and leads to more compact networks.
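
The idea of a trainable input window can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian window shape, the single-connection sigmoid unit, and all names (`gaussian_window`, `windowed_input`, `unit_output`, `numeric_grad`) are assumptions chosen for illustration, since the abstract does not specify them. The sketch only shows that when a connection reads its input through a smooth window with a delay (center) and a width, the unit's output varies smoothly with both, so they can in principle be adjusted by gradient descent alongside the weight.

```python
import math


def gaussian_window(delay, width, max_lag):
    """Normalized Gaussian weights over lags 0..max_lag, centered at `delay`.

    Assumption: a Gaussian shape is used here only because it is smooth in
    both `delay` and `width`; the abstract does not state the window shape.
    """
    raw = [math.exp(-0.5 * ((lag - delay) / width) ** 2)
           for lag in range(max_lag + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def windowed_input(signal, t, delay, width, max_lag):
    """Weighted sum of past samples signal[t - lag] under the window."""
    window = gaussian_window(delay, width, max_lag)
    return sum(g * signal[t - lag]
               for lag, g in enumerate(window) if t - lag >= 0)


def unit_output(signal, t, weight, delay, width, max_lag=10):
    """Sigmoid unit fed by a single windowed connection (hypothetical)."""
    net = weight * windowed_input(signal, t, delay, width, max_lag)
    return 1.0 / (1.0 + math.exp(-net))


def numeric_grad(f, x, eps=1e-5):
    """Central-difference estimate of df/dx, standing in for backpropagation."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


# Toy input: one impulse at frame 12, observed from frame 15 (3 frames back).
signal = [0.0] * 20
signal[12] = 1.0
t = 15

# A window centered on the impulse responds more strongly than one far away.
out_near = unit_output(signal, t, weight=4.0, delay=3.0, width=1.0)
out_far = unit_output(signal, t, weight=4.0, delay=7.0, width=1.0)

# The output is differentiable in the delay: starting at delay 5, the
# gradient indicates in which direction to move the window.
grad_delay = numeric_grad(
    lambda d: unit_output(signal, t, 4.0, d, 1.0), 5.0)
```

In this toy setting the gradient with respect to the delay is negative at a delay of 5 frames, meaning that a smaller delay (closer to the impulse 3 frames back) would raise the output. The same finite-difference check can be applied to `width`.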

Keywords

delays; neural nets; speech recognition; Tempo 2 algorithm; Tempo 2 network; adaptive architecture; adaptive parameter; input windows; learning; neural networks architecture; phoneme classification; speech recognition systems; temporal parameters; time-delay neural network; training algorithm; widths; Computer architecture; Computer networks; Computer science; Delay effects; Feedforward neural networks; Frequency; Multi-layer neural network; Neural networks; Spectrogram; Speech recognition

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1991. ICASSP-91., 1991 International Conference on

Conference_Location

Toronto, Ont.

ISSN

1520-6149

Print_ISBN

0-7803-0003-3

Type

conf

DOI

10.1109/ICASSP.1991.150292

Filename

150292