Large vocabulary continuous speech recognition of Wall Street Journal data

Author

Aubert, X. ; Dugast, C. ; Ney, H. ; Steinbiss, V.

Author_Institution

Philips GmbH Res. Lab. Aachen, Germany

Volume

ii

fYear

1994

fDate

19-22 Apr 1994

Abstract

We report on recent developments of the Philips large vocabulary speech recognition system and on our experiments with the Wall Street Journal (WSJ) corpus. A two-pass decoding strategy has been devised that allows easy integration of more complex language models. First, a word lattice is produced using a time-synchronous beam search with a bigram language model. Next, a higher-order language model is applied to the lattice at the phrase level. The conditions ensuring the validity of this approach are explained, and practical results for a trigram language model demonstrate its usefulness. The main system development stages on WSJ data are presented, and our final recognizers are evaluated on the Nov. '92 and Nov. '93 test data for both 5K and 20K vocabularies.
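
The second pass described in the abstract can be illustrated with a minimal sketch. It assumes a word lattice has already been produced by a first pass and shows how a trigram model might be applied to it by dynamic programming over (node, two-word history) states. The lattice, the scores, the `UNSEEN_LOGP` penalty, the `lm_scale` parameter, and all function names are hypothetical illustrations, not the Philips system or its data.

```python
from collections import defaultdict

# Hypothetical toy lattice: (start_node, end_node, word, acoustic_log_score).
# Node ids are assumed to follow time order: 0 is the start node, 3 the end node.
LATTICE = [
    (0, 1, "stocks", -2.0),
    (0, 1, "stock", -1.8),
    (1, 2, "rose", -1.5),
    (1, 2, "rows", -1.4),
    (2, 3, "sharply", -1.0),
]

# Hypothetical trigram log-probabilities (illustrative values only).
TRIGRAM_LOGP = {
    ("<s>", "<s>", "stocks"): -1.0,
    ("<s>", "<s>", "stock"): -1.2,
    ("<s>", "stocks", "rose"): -0.5,
    ("<s>", "stocks", "rows"): -4.0,
    ("<s>", "stock", "rose"): -2.0,
    ("<s>", "stock", "rows"): -3.0,
    ("stocks", "rose", "sharply"): -0.3,
    ("stock", "rose", "sharply"): -0.6,
    ("stocks", "rows", "sharply"): -3.0,
    ("stock", "rows", "sharply"): -3.0,
}
UNSEEN_LOGP = -10.0  # assumed flat penalty for trigrams missing from the table


def trigram_logp(w1, w2, w3):
    """Look up log P(w3 | w1, w2), falling back to the flat penalty."""
    return TRIGRAM_LOGP.get((w1, w2, w3), UNSEEN_LOGP)


def rescore(lattice, start, end, lm_scale=1.0):
    """Return (score, words) of the best lattice path under the trigram model."""
    outgoing = defaultdict(list)
    for src, dst, word, acoustic in lattice:
        outgoing[src].append((dst, word, acoustic))

    # A search state is (node, last two words); keep only the best path per state.
    best = {(start, ("<s>", "<s>")): (0.0, [])}
    nodes = sorted({n for edge in lattice for n in edge[:2]})
    for node in nodes:
        # Snapshot the states at this node before extending them.
        states = [(key, val) for key, val in best.items() if key[0] == node]
        for (_, hist), (score, words) in states:
            for dst, word, acoustic in outgoing[node]:
                lm = trigram_logp(hist[0], hist[1], word)
                new_score = score + acoustic + lm_scale * lm
                key = (dst, (hist[1], word))
                if key not in best or new_score > best[key][0]:
                    best[key] = (new_score, words + [word])

    finals = [val for key, val in best.items() if key[0] == end]
    return max(finals, key=lambda val: val[0])


if __name__ == "__main__":
    score, words = rescore(LATTICE, start=0, end=3)
    print(words, score)
```

In this toy lattice the acoustically best path is "stock rows sharply", while the trigram scores favor "stocks rose sharply". Splitting each lattice node by its two-word history is what lets the longer-span model be applied without repeating the acoustic search.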

Keywords

decoding; dictation; speech recognition; vocabulary; Philips dictation system; WSJ data; Wall Street Journal data; bigram language model; higher-order language model; language models; large vocabulary continuous speech recognition; time synchronous beam search; two-pass decoding; word lattice; Acoustic beams; Acoustic testing; Decoding; Hidden Markov models; Laboratories; Lattices; Speech recognition; System testing; Vectors; Vocabulary;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on

Conference_Location

Adelaide, SA

ISSN

1520-6149

Print_ISBN

0-7803-1775-0

Type

conf

DOI

10.1109/ICASSP.1994.389702

Filename

389702