The LIMSI continuous speech dictation system: evaluation on the ARPA Wall Street Journal task

Author

Gauvain, J.-L. ; Lamel, L.F. ; Adda, G. ; Adda-Decker, M.

Author_Institution

Lab. d'Informatique pour la Mecanique et les Sci. de l'Ingenieur, CNRS, Orsay, France

Volume

i

fYear

1994

fDate

19-22 Apr 1994

Abstract

We report progress made at LIMSI in speaker-independent large vocabulary speech dictation using the ARPA Wall Street Journal-based CSR corpus. The recognizer uses continuous-density HMMs with Gaussian mixtures for acoustic modeling and n-gram statistics estimated on the newspaper texts for language modeling. It uses a time-synchronous graph-search strategy, which is shown to remain viable with vocabularies of up to 20,000 words when used with bigram back-off language models. A second forward pass, which makes use of a word graph generated with the bigram, incorporates a trigram language model. Acoustic modeling uses cepstrum-based features, context-dependent phone models (intra- and interword), phone duration models, and sex-dependent models. The recognizer has been evaluated in the Nov92 and Nov93 ARPA tests for vocabularies of up to 20,000 words.

Keywords

Gaussian processes; acoustic analysis; dictation; graph theory; hidden Markov models; natural languages; search problems; speech recognition; speech recognition equipment; ARPA Wall Street Journal task; CSR corpus; Gaussian mixture; LIMSI continuous speech dictation system; Nov92 ARPA tests; Nov93 ARPA tests; acoustic modeling; bigram back-off language models; cepstrum-based features; context-dependent phone models; continuous density HMM; language modeling; large vocabulary; newspaper texts; phone duration models; second forward pass; sex-dependent models; speaker-independent speech dictation; time-synchronous graph-search; trigram language model; word graph; Context modeling; Hidden Markov models; Materials testing; Smoothing methods; Speech analysis; Speech recognition; Statistics; Text recognition; Training data; Vocabulary;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on

Conference_Location

Adelaide, SA

ISSN

1520-6149

Print_ISBN

0-7803-1775-0

Type

conf

DOI

10.1109/ICASSP.1994.389233

Filename

389233