A 10000-word continuous-speech recognition system

Author

Steinbiss, V. ; Noll, A. ; Paeseler, A. ; Ney, H. ; Bergmann, H. ; Dugast, C. ; Hamer, H.H. ; Piotrowski, H. ; Tomaschewski, H. ; Zielinski, A.

Author_Institution

Philips GmbH Forschungslab., Hamburg, West Germany

fYear

1990

fDate

3-6 Apr 1990

Firstpage

57

Abstract

Some results obtained when the recognition vocabulary size of a phoneme-based speaker-dependent continuous-speech recognizer was increased from 1000 to 10000 words are reported. The potential search space increased from 46000 to 516000 states without problems for the data-driven search. Increasing the recognition vocabulary by a factor of 10 (from a perplexity of 917 to 9686) increased the word error rate by a factor of two (from 21.8% to 43.1%). Phoneme models were tested with both discrete probabilities and continuous mixture densities. The mixture density models performed better; moreover, they saved about half of the search costs. A language model was found to be very important for a larger vocabulary size. With a test set perplexity of 388 (i.e., a reduction by a factor of 25 compared to the case without a bigram model) the error rate decreased by a factor of 2.4. In order to check how meaningful perplexity is for the prediction of the system's performance, a stochastic language model was constructed with a perplexity of 1000, the size of the vocabulary used in previous experiments, and about the same error rate was obtained.

Keywords

speech analysis and processing; speech recognition; German language; bigram model; continuous-speech recognition system; data-driven search; language model; mixture density models; phoneme-based speaker-dependent continuous-speech recognizer; potential search space; stochastic language model; vocabulary of 10000 words; word error rate; Acoustic testing; Costs; Error analysis; Hidden Markov models; Natural languages; Predictive models; Speech; Stochastic systems; System performance; System testing; Testing; Vocabulary;

fLanguage

English

Publisher

ieee

Conference_Titel

Acoustics, Speech, and Signal Processing, 1990. ICASSP-90., 1990 International Conference on

Conference_Location

Albuquerque, NM

ISSN

1520-6149

Type

conf

DOI

10.1109/ICASSP.1990.115536

Filename

115536