Improved lexicon modeling for continuous speech recognition

Author

Yun, Seong Jin ; Oh, Yung Hwan ; Shin, Gyung Chul

Author_Institution

Dept. of Comput. Sci., Korea Adv. Inst. of Sci. & Technol., Taejon, South Korea

Volume

3

fYear

1997

Firstpage

1827

Abstract

We propose a stochastic lexicon model that represents pronunciation variations so that the lexicon works optimally with a continuous speech recognizer. In this lexicon model, the baseform of each word is represented as a hidden Markov model, using subword states and the probability distribution of subwords. The proposed approach can also be applied to a system employing non-linguistic recognition units, and the lexicon is trained automatically from training utterances. In speaker-independent speech recognition tests using a 3000-word continuous speech database, the proposed system improves the word accuracy by about 27.8% and the sentence accuracy by about 22.4%.
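
As an illustration of the idea in the abstract, the following is a minimal sketch of a word baseform stored as a discrete hidden Markov model over subword units and scored with the standard forward algorithm. It is not the authors' implementation: the function name `forward_likelihood`, the two-state topology, the unit labels "a", "b", "c", and all probabilities are hypothetical values chosen for the example, and no final-state constraint is imposed.

```python
from typing import Dict, List


def forward_likelihood(
    units: List[str],
    initial: List[float],
    transitions: List[List[float]],
    emissions: List[Dict[str, float]],
) -> float:
    """Return P(units | word model), summed over all state paths."""
    n_states = len(initial)
    if not units:
        return 0.0
    # alpha[j] = P(units seen so far, current state = j)
    alpha = [initial[j] * emissions[j].get(units[0], 0.0) for j in range(n_states)]
    for unit in units[1:]:
        alpha = [
            sum(alpha[i] * transitions[i][j] for i in range(n_states))
            * emissions[j].get(unit, 0.0)
            for j in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state, left-to-right lexicon entry for one word.
# Each state holds a probability distribution over subword units, so
# pronunciation variants ("a c" versus "b c") receive different scores.
INITIAL = [1.0, 0.0]
TRANSITIONS = [
    [0.5, 0.5],  # state 0: stay, or advance to state 1
    [0.0, 1.0],  # state 1: stay
]
EMISSIONS = [
    {"a": 0.8, "b": 0.2},  # state 0 usually emits "a", sometimes "b"
    {"c": 0.9, "b": 0.1},  # state 1 usually emits "c", sometimes "b"
]

common_variant = forward_likelihood(["a", "c"], INITIAL, TRANSITIONS, EMISSIONS)
rare_variant = forward_likelihood(["b", "c"], INITIAL, TRANSITIONS, EMISSIONS)
```

With these made-up parameters the common variant scores higher than the rare one, which is the behavior a stochastic lexicon is meant to capture. In a trained system the distributions would be estimated from training utterances rather than written by hand.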

Keywords

hidden Markov models; probability; speech processing; speech recognition; stochastic processes; continuous speech database; continuous speech recognition; hidden Markov model; lexicon modeling; nonlinguistic recognition units; pronunciation variations; sentence accuracy; speaker independent speech recognition tests; stochastic lexicon model; subword states; subwords probability distribution; training utterances; word accuracy; word baseform; Automatic speech recognition; Computer science; Databases; Electronic mail; Hidden Markov models; Probability distribution; Speech recognition; Stochastic processes; Stochastic systems; System testing

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-8186-7919-0

Type

conf

DOI

10.1109/ICASSP.1997.598892

Filename

598892