DocumentCode
312003
Title
A category based approach for recognition of out-of-vocabulary words
Author
Gallwitz, F. ; Nöth, E. ; Niemann, H.
Author_Institution
Lehrstuhl für Mustererkennung, Erlangen-Nürnberg Univ., Germany
Volume
1
fYear
1996
fDate
3-6 Oct 1996
Firstpage
228
Abstract
In almost all applications of automatic speech recognition, especially in spontaneous speech tasks, the recognizer vocabulary cannot cover all occurring words. There is always a significant number of out-of-vocabulary words, even when the vocabulary is very large. We present a new approach for integrating out-of-vocabulary words into statistical language models. We use category information for all words in the training corpus to define a function that approximates the out-of-vocabulary word emission probability for each word category. This information is integrated into the language models. Although we use a simple acoustic model for out-of-vocabulary words, we achieve a 6% reduction in word error rate on spontaneous speech data with an out-of-vocabulary rate of about 5%.
Keywords
computational linguistics; natural language interfaces; probability; speech recognition; statistical analysis; vocabulary; acoustic model; automatic speech recognition; category based approach; out-of-vocabulary word recognition; spontaneous speech data; spontaneous speech tasks; statistical language models; training corpus; word emission probability; word error rate; Acoustic applications; Acoustic emission; Automatic speech recognition; Context modeling; Information retrieval; Predictive models; Probability; Speech recognition; Telephony; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
0-7803-3555-4
Type
conf
DOI
10.1109/ICSLP.1996.607083
Filename
607083