Title :
Experiments with temporal resolution for continuous speech recognition with multi-layer perceptrons
Author :
Morgan, Nelson; Wooters, C.; Hermansky, Hynek
Author_Institution :
Int. Comput. Sci. Inst., Berkeley, CA, USA
Date :
30 Sep-1 Oct 1991
Abstract :
Previous work by the authors focused on the integration of multilayer perceptrons (MLPs) into hidden Markov models (HMMs) and on the use of perceptual linear prediction (PLP) parameters as the feature inputs to such nets. The system uses the Viterbi algorithm for temporal alignment. This algorithm is simple and optimal, but it necessitates a frame-based analysis in which all features have the same implicit time constants. The authors provide a range of temporal/spectral resolution choices to a frame-based system by using a layered network to incorporate this information for phonetic discrimination. They performed experiments in which they expanded their PLP analysis to include short analysis windows and trained phonetic classification networks to incorporate this added information. They hypothesized that classification scores would improve, especially for short-duration phonemes. These experiments did not yield the expected improvement.
Keywords :
hidden Markov models; neural nets; speech analysis and processing; speech recognition; Viterbi algorithm; continuous speech recognition; frame-based analysis; multilayer perceptrons; perceptual linear prediction; phonetic classification networks; short-duration phonemes; temporal alignment; temporal resolution; Auditory system; Information analysis; Natural languages; Performance analysis; Signal resolution; Spatial databases
Conference_Title :
Neural Networks for Signal Processing, Proceedings of the 1991 IEEE Workshop
Conference_Location :
Princeton, NJ
Print_ISBN :
0-7803-0118-8
DOI :
10.1109/NNSP.1991.239501