Regional Information Center for Science and Technology (RICEST) - Using word temporal structure in HMM speech recognition

DocumentCode :

310559

Title :

Using word temporal structure in HMM speech recognition

Author :

Fissore, L. ; Laface, P. ; Ravera, F.

Author_Institution :

CSELT, Torino, Italy

Volume :

fYear :

1997

fDate :

21-24 Apr 1997

Firstpage :

975

Abstract :

Isolated-word speech recognizers with fixed vocabularies are often used to provide vocal services over the telephone line. The paper illustrates a simple postprocessing approach that allows the hypotheses produced by a hidden Markov model recognizer to be rescored, taking into account the global temporal structure of the pronounced words. Our approach does not directly rely on state/word duration modeling. Instead, it models the global time variations of the spectral features of each word and their correlation in time: two important perceptual cues that are only partially exploited by standard HMMs. The method has been evaluated using three isolated-word, speaker-independent systems with vocabularies of different size and complexity. We show that, with minimal overhead, recognition performance improves not only for small-vocabulary recognition systems, such as the isolated digit one or the one recognizing 26 Italian spelling names, but also for a system with a vocabulary of 475 city names that is part of a vocal service providing information about the main railway connections.

Keywords :

correlation methods; hidden Markov models; spectral analysis; speech processing; speech recognition; telephone lines; voice communication; HMM speech recognition; Italian spelling names; city name vocabulary; correlation; fixed vocabularies; global temporal structure; global time variations; hidden Markov model recognizer; isolated word speaker independent systems; isolated word speech recognizers; main railway connections; perceptual cues; postprocessing approach; pronounced words; recognition performance; small vocabulary recognition systems; spectral features; telephone line; vocabulary complexity; vocabulary size; vocal services; word temporal structure; Cepstral analysis; Computational complexity; Decoding; Hidden Markov models; Laboratories; Predictive models; Robustness; Speech recognition; Vectors; Vocabulary;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on

Conference_Location :

Munich

ISSN :

1520-6149

Print_ISBN :

0-8186-7919-0

Type :

conf

DOI :

10.1109/ICASSP.1997.596101

Filename :

596101

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=310559