DocumentCode :
1365266
Title :
HMM-based stressed speech modeling with application to improved synthesis and recognition of isolated speech under stress
Author :
Bou-Ghazale, Sahar E.; Hansen, John H. L.
Author_Institution :
Robust Speech Processing Lab., Duke Univ., Durham, NC, USA
Volume :
6
Issue :
3
fYear :
1998
fDate :
5/1/1998
Firstpage :
201
Lastpage :
216
Abstract :
A novel approach is proposed for modeling speech parameter variations between neutral and stressed conditions and employed in a technique for stressed speech synthesis and recognition. The proposed method consists of modeling the variations in pitch contour, voiced speech duration, and average spectral structure using hidden Markov models (HMMs). While HMMs have traditionally been used for recognition applications, here they are employed to statistically model the characteristics needed for generating pitch contour and spectral perturbation contour patterns to modify the speaking style of isolated neutral words. The proposed HMMs are both speaker- and word-independent, but unique to each speaking style. While the modeling scheme is applicable to a variety of stress and emotional speaking styles, the evaluations presented focus on angry speech, the Lombard (1911) effect, and loud spoken speech, and cover three areas. First, formal subjective listener evaluations of the modified speech confirm the HMMs' ability to capture the parameter variations under stressed conditions. Second, an objective evaluation using a separately formulated stress classifier is employed to assess the presence of stress imparted on the synthetic speech. Finally, the stressed speech is also used for training and shown to measurably improve the performance of an HMM-based stressed speech recognizer.
Keywords :
hidden Markov models; spectral analysis; speech recognition; speech synthesis; statistical analysis; HMM; HMM-based stressed speech modeling; Lombard effect; angry speech; average spectral structure; emotional speaking styles; formal subjective listener evaluations; isolated speech; loud spoken speech; neutral conditions; objective evaluation; pitch contour; speaker-independent model; spectral perturbation contour patterns; speech parameter variations; statistical model; stress classifier; stressed conditions; stressed speech recognition; stressed speech synthesis; training; voiced speech duration; word-independent model; Character recognition; Laboratories; Pattern recognition; Robustness; Speech analysis; Speech processing; Stress
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.668815
Filename :
668815