Use of acoustic sentence level and lexical stress in HSMM speech recognition

Author

Hieronymus, J.L. ; McKelvie, D. ; McInnes, F.R.

Author_Institution

Center for Speech Technol. Res., Edinburgh Univ., UK

Volume

1

fYear

1992

fDate

23-26 Mar 1992

Firstpage

225

Abstract

The authors describe the results of an experiment to study the effectiveness of using acoustic stress to improve automatic speech recognition. The CSTR speech recognition system uses hidden semi-Markov models (HSMMs) with a separate lexical search component. A hybrid prosodic component has been included which determines the sentence-level stress and marks the vowel of each stressed syllable as stressed in the phoneme lattice. Lexical stress is marked on all content words in the lexicon. Adding stress information to the system in this way resulted in a 65% reduction in word error rate and a 45% reduction in sentence error rate, relative to a baseline system without prosody.

Keywords

error statistics; hidden Markov models; speech recognition; HSMM speech recognition; acoustic sentence level; acoustic stress; automatic speech recognition; hybrid prosodic component; lexical stress; sentence error rate; sentence level stress; word error rate; Bridges; Continuous-stirred tank reactor; Costs; Dynamic programming; Error analysis; Lattices; Matrices; Stress

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1992. ICASSP-92., 1992 IEEE International Conference on

Conference_Location

San Francisco, CA

ISSN

1520-6149

Print_ISBN

0-7803-0532-9

Type

conf

DOI

10.1109/ICASSP.1992.225931

Filename

225931