A fully recurrent neural network for recognition of noisy telephone speech

Author

Kasper, K. ; Reininger, H. ; Wolf, D. ; Wüst, H.

Author_Institution

Inst. für Angewandte Phys., Frankfurt Univ., Germany

Volume

5

fYear

1995

fDate

9-12 May 1995

Firstpage

3331

Abstract

For a variety of telephone applications it is sufficient to realize a speech recognition system (SRS) with a system vocabulary consisting of a few command words, digits, and connected digits. However, in developing an SRS for use in a telephone environment, it has to be considered that the speech is bandpass limited and that high recognition performance has to be guaranteed under speaker-independent and even adverse conditions. Furthermore, it is important that the SRS can be implemented efficiently. Fully recurrent neural networks (FRNN) provide a new approach for realizing a robust SRS with a single network. FRNN are able to score features discriminatively and independently of the length of the feature sequence. In SRS based on hidden Markov models (HMM), by contrast, different methods have to be applied for scoring the feature vectors and for compensating for the variations in phone durations. Here we report on investigations to realize a monolithic SRS based on FRNN for telephone speech. Besides isolated word recognition, the capability of the FRNN-SRS to deal with connected digit recognition is presented. Furthermore, it is shown how FRNN can be immunized against several types of additive noise.

Keywords

hidden Markov models; multilayer perceptrons; noise; recurrent neural nets; speech recognition; HMM; additive noise; bandpass limited speech; connected digit recognition; feature scoring; feature sequence; feature vectors; fully recurrent neural network; isolated word recognition; monolithic SRS; noisy telephone speech recognition; phone durations; recognition performance; system vocabulary; Additive noise; Cognition; Hidden Markov models; Neurons; Nonlinear dynamical systems; Recurrent neural networks; Robustness; Speech recognition; Telephony; Vocabulary

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on

Conference_Location

Detroit, MI

ISSN

1520-6149

Print_ISBN

0-7803-2431-5

Type

conf

DOI

10.1109/ICASSP.1995.479698

Filename

479698