• DocumentCode
    290014
  • Title

    The LIMSI continuous speech dictation system: evaluation on the ARPA Wall Street Journal task

  • Author

    Gauvain, J.-L. ; Lamel, L.F. ; Adda, G. ; Adda-Decker, M.

  • Author_Institution
    Lab. d'Informatique pour la Mecanique et les Sci. de l'Ingenieur, CNRS, Orsay, France
  • Volume
    i
  • fYear
    1994
  • fDate
    19-22 Apr 1994
  • Abstract
    We report progress made at LIMSI in speaker-independent large vocabulary speech dictation using the ARPA Wall Street Journal-based CSR corpus. The recognizer makes use of continuous density HMM with Gaussian mixture for acoustic modeling and n-gram statistics estimated on the newspaper texts for language modeling. The recognizer uses a time-synchronous graph-search strategy which is shown to still be viable with vocabularies of up to 20 K words when used with bigram back-off language models. A second forward pass, which makes use of a word graph generated with the bigram, incorporates a trigram language model. Acoustic modeling uses cepstrum-based features, context-dependent phone models (intra and interword), phone duration models, and sex-dependent models. The recognizer has been evaluated in the Nov92 and Nov93 ARPA tests for vocabularies of up to 20,000 words.
  • Keywords
    Gaussian processes; acoustic analysis; dictation; graph theory; hidden Markov models; natural languages; search problems; speech recognition; speech recognition equipment; ARPA Wall Street Journal task; CSR corpus; Gaussian mixture; LIMSI continuous speech dictation system; Nov92 ARPA tests; Nov93 ARPA tests; acoustic modeling; bigram back-off language models; cepstrum-based features; context-dependent phone models; continuous density HMM; language modeling; large vocabulary; newspaper texts; phone duration models; second forward pass; sex-dependent models; speaker-independent speech dictation; time-synchronous graph-search; trigram language model; word graph; Context modeling; Hidden Markov models; Materials testing; Smoothing methods; Speech analysis; Speech recognition; Statistics; Text recognition; Training data; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
  • Conference_Location
    Adelaide, SA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-1775-0
  • Type

    conf

  • DOI
    10.1109/ICASSP.1994.389233
  • Filename
    389233