• DocumentCode
    290045
  • Title

    Duration and spectral based stress token generation for HMM speech recognition under stress

  • Author

    Bou-Ghazale, Sahar E. ; Hansen, John H L

  • Author_Institution
    Dept. of Electr. Eng., Duke Univ., Durham, NC, USA
  • Volume
    i
  • fYear
    1994
  • fDate
    19-22 Apr 1994
  • Abstract
    In this paper, we address the problem of isolated word recognition of speech under various stressed speaking conditions. The main objective is to formulate an alternate training algorithm for hidden Markov model recognition that better characterizes actual speech production under stressed speaking styles such as slow, loud, and Lombard (1911) effect, without the need to collect such stressed speech data. The novel approach is to first construct a previously suggested source generator model of word production, employing knowledge of the statistical nature of duration and spectral variation of speech under stress. This model is used in turn to produce simulated stressed speech training tokens from neutral tokens, which replace the neutral data used in the recognizer training phase. The token generation training method is shown to improve isolated word recognition by 8% for slow speaking style, 14% for loud speaking style, and 24% for speech under Lombard effect when compared to neutral-trained isolated word recognition.
  • Keywords
    hidden Markov models; spectral analysis; speech recognition; HMM speech recognition; Lombard effect; duration; hidden Markov model recognition; isolated word recognition; loud speaking style; recognizer training; simulated stressed speech training tokens; slow speaking style; source generator model; spectral variation; speech production; statistics; stress token generation; stressed speaking conditions; stressed speaking styles; training algorithm; training method; Automatic speech recognition; Data mining; Hidden Markov models; Humans; Laboratories; Speech enhancement; Speech processing; Speech recognition; Stress; Working environment noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
  • Conference_Location
    Adelaide, SA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-1775-0
  • Type

    conf

  • DOI
    10.1109/ICASSP.1994.389268
  • Filename
    389268