Duration and spectral based stress token generation for HMM speech recognition under stress

Author

Bou-Ghazale, Sahar E. ; Hansen, John H L

Author_Institution

Dept. of Electr. Eng., Duke Univ., Durham, NC, USA

Volume

i

fYear

1994

fDate

19-22 Apr 1994

Abstract

In this paper, we address the problem of isolated word recognition of speech under various stressed speaking conditions. The main objective is to formulate an alternate training algorithm for hidden Markov model recognition that better characterizes actual speech production under stressed speaking styles such as slow, loud, and Lombard (1911) effect, without the need to collect such stressed speech data. The novel approach is to first construct a previously suggested source generator model of word production, employing knowledge of the statistical nature of duration and spectral variation of speech under stress. This model is then used to produce simulated stressed speech training tokens from neutral tokens, which replace the neutral data used in the recognizer training phase. The token generation training method is shown to improve isolated word recognition by 8% for slow speaking style, 14% for loud speaking style, and 24% for speech under Lombard effect when compared to neutral trained isolated word recognition.
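
The token generation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' source generator model, whose details the abstract does not give. It assumes a token is a list of feature frames (each a list of floats), that duration variation can be approximated by a single random time-scale factor, and that spectral variation can be approximated by a random per-dimension offset. The function names and the parameters `dur_mean`, `dur_std`, `shift_mean`, and `shift_std` are hypothetical.

```python
import random


def stretch_frames(frames, scale):
    """Linearly resample a frame sequence to round(len(frames) * scale) frames."""
    if not frames:
        raise ValueError("token must contain at least one frame")
    n_in = len(frames)
    n_out = max(1, round(n_in * scale))
    if n_in == 1 or n_out == 1:
        # Nothing to interpolate between: repeat the first frame.
        return [list(frames[0]) for _ in range(n_out)]
    out = []
    for i in range(n_out):
        pos = i * (n_in - 1) / (n_out - 1)  # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


def shift_spectrum(frames, offsets):
    """Add a fixed per-dimension offset to every frame."""
    return [[x + d for x, d in zip(frame, offsets)] for frame in frames]


def generate_stressed_token(neutral, dur_mean, dur_std, shift_mean, shift_std, rng):
    """Draw one simulated stressed token from a neutral token.

    Assumption (not from the paper): duration scale and spectral offsets are
    sampled from independent Gaussians with the given means and deviations.
    """
    scale = max(0.1, rng.gauss(dur_mean, dur_std))  # keep the scale positive
    offsets = [rng.gauss(m, s) for m, s in zip(shift_mean, shift_std)]
    return shift_spectrum(stretch_frames(neutral, scale), offsets)


if __name__ == "__main__":
    rng = random.Random(0)
    neutral = [[0.0, 1.0], [1.0, 1.5], [2.0, 2.0], [3.0, 2.5]]
    # Hypothetical statistics for a "slow" speaking style.
    token = generate_stressed_token(neutral, 1.5, 0.1, [0.2, -0.1], [0.05, 0.05], rng)
    print(len(token), "frames generated from", len(neutral), "neutral frames")
```

In a training setup along these lines, many such tokens would be drawn per neutral token and passed to the HMM trainer in place of the neutral data; the statistics themselves would have to be estimated from a stressed speech corpus.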

Keywords

hidden Markov models; spectral analysis; speech recognition; HMM speech recognition; Lombard effect; duration; hidden Markov model recognition; isolated word recognition; loud speaking style; recognizer training; simulated stressed speech training tokens; slow speaking style; source generator model; spectral variation; speech production; statistics; stress token generation; stressed speaking conditions; stressed speaking styles; training algorithm; training method; Automatic speech recognition; Data mining; Hidden Markov models; Humans; Laboratories; Speech enhancement; Speech processing; Speech recognition; Stress; Working environment noise;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on

Conference_Location

Adelaide, SA

ISSN

1520-6149

Print_ISBN

0-7803-1775-0

Type

conf

DOI

10.1109/ICASSP.1994.389268

Filename

389268