DocumentCode
290118
Title
An efficient combination of acoustic and supra-segmental informations in a speech recognition system
Author
Suaudeau, Nelly ; André-Obrecht, Régine
Author_Institution
IRISA, Rennes, France
Volume
i
fYear
1994
fDate
19-22 Apr 1994
Abstract
A major deficiency of a standard HMM is that the spectral and the prosodic features are processed uniformly. To combine the prosodic cues with the acoustic ones more efficiently, a two-level HMM that separates the spectral and suprasegmental representations is defined. Specifically, the incorporation of global sound durations is explored. Moreover, to take into account the effects of speaking rate on the durational parameters of phonetic units, two durational models are proposed. The ways in which these models are integrated into the recognition process are described. Experiments on a French number database show that such an explicit introduction of prosodic parameters reduces recognition error rates by 20%.
Keywords
hidden Markov models; signal representation; spectral analysis; speech recognition; French number database; acoustic information; global sound duration; phonetic unit durational parameters; prosodic features; recognition error rates; recognition processing; representation; speaking rate; spectral features; speech recognition system; suprasegmental information; two level HMM; Databases; Error analysis; Hidden Markov models; Histograms; Network topology; Probability distribution; Solid modeling; Speech processing; Speech recognition; Training data;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
Conference_Location
Adelaide, SA
ISSN
1520-6149
Print_ISBN
0-7803-1775-0
Type
conf
DOI
10.1109/ICASSP.1994.389354
Filename
389354