DocumentCode :
310559
Title :
Using word temporal structure in HMM speech recognition
Author :
Fissore, L. ; Laface, P. ; Ravera, F.
Author_Institution :
CSELT, Torino, Italy
Volume :
2
fYear :
1997
fDate :
21-24 Apr 1997
Firstpage :
975
Abstract :
Isolated-word speech recognizers with fixed vocabularies are often used to provide vocal services over the telephone line. The paper illustrates a simple postprocessing approach that allows the hypotheses produced by a hidden Markov model (HMM) recognizer to be rescored taking into account the global temporal structure of the pronounced words. Our approach does not directly rely on state/word duration modeling. Instead, it models the global time variations of the spectral features of each word and their correlation in time: two important perceptual cues that are only partially exploited by standard HMMs. The method has been evaluated using three isolated-word, speaker-independent systems with vocabularies of different size and complexity. We show that, with minimal overhead, recognition performance improves not only for small-vocabulary systems, such as isolated digit recognition or the recognition of 26 Italian spelling names, but also for a system with a vocabulary of 475 city names that is part of a vocal service providing information about the main railway connections.
Keywords :
correlation methods; hidden Markov models; spectral analysis; speech processing; speech recognition; telephone lines; voice communication; HMM speech recognition; Italian spelling names; city name vocabulary; correlation; fixed vocabularies; global temporal structure; global time variations; hidden Markov model recognizer; isolated word speaker independent systems; isolated word speech recognizers; main railway connections; perceptual cues; postprocessing approach; pronounced words; recognition performance; small vocabulary recognition systems; spectral features; telephone line; vocabulary complexity; vocabulary size; vocal services; word temporal structure; Cepstral analysis; Computational complexity; Decoding; Hidden Markov models; Laboratories; Predictive models; Robustness; Speech recognition; Vectors; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location :
Munich
ISSN :
1520-6149
Print_ISBN :
0-8186-7919-0
Type :
conf
DOI :
10.1109/ICASSP.1997.596101
Filename :
596101