Title :
Adaptive vocabularies for transcribing multilingual broadcast news
Author :
Geutner, P. ; Finke, M. ; Scheytt, P.
Author_Institution :
Interactive Syst. Labs., Karlsruhe Univ., Germany
Abstract :
One of the most persistent problems of large-vocabulary speech recognition systems is the large number of out-of-vocabulary (OOV) words. This is especially the case when automatically transcribing broadcast news in languages other than English that have a large number of inflections and compound words. We introduce a set of techniques to decrease the number of out-of-vocabulary words during recognition by using linguistic knowledge about morphology and a two-pass recognition approach, in which the first pass serves only to dynamically adapt the recognition dictionary to the speech segment to be recognized. A second recognition run is then carried out with the adapted vocabulary. With the proposed techniques we were able to reduce the OOV rate by more than 40%, thereby also improving word accuracy by 5.8% absolute, from 64% to 69.8%.
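Illustration (not part of the original record): the two-pass idea in the abstract can be sketched roughly as follows. This is a toy under stated assumptions, not the authors' implementation. The fixed-length prefix "stem" is a hypothetical stand-in for real morphological knowledge, and the names `oov_rate`, `crude_stem`, `adapt_vocabulary`, as well as the word lists, are invented for this example.

```python
def oov_rate(reference_words, vocabulary):
    """Fraction of reference tokens that are not covered by the vocabulary."""
    if not reference_words:
        return 0.0
    missing = sum(1 for w in reference_words if w not in vocabulary)
    return missing / len(reference_words)


def crude_stem(word, stem_len=5):
    """ASSUMPTION: a fixed-length prefix stands in for morphological analysis."""
    return word[:stem_len].lower()


def adapt_vocabulary(first_pass_words, base_vocab, background_lexicon):
    """Add background-lexicon words that share a crude stem with first-pass output."""
    stems = {crude_stem(w) for w in first_pass_words}
    related = {w for w in background_lexicon if crude_stem(w) in stems}
    return set(base_vocab) | related


# Hypothetical data: a small recognition vocabulary and a larger background lexicon.
base_vocab = {"der", "die", "regierung", "hat", "heute", "entschieden"}
background_lexicon = {"regierungen", "regierungschef", "entscheidung", "wetter"}

# Pass 1: hypothesis produced with the base vocabulary (the OOV word is misrecognized).
first_pass = ["die", "regierung", "hat", "heute", "entschieden"]

# What was actually spoken; "regierungschef" is out of vocabulary in pass 1.
reference = ["der", "regierungschef", "hat", "heute", "entschieden"]

# The adapted vocabulary would then be used for the second recognition pass.
adapted_vocab = adapt_vocabulary(first_pass, base_vocab, background_lexicon)

rate_before = oov_rate(reference, base_vocab)
rate_after = oov_rate(reference, adapted_vocab)
print(f"OOV rate before adaptation: {rate_before:.1%}")
print(f"OOV rate after adaptation:  {rate_after:.1%}")
```

In this toy setup the first-pass hypothesis only selects which related word forms to pull into the dictionary; unrelated entries such as "wetter" stay out, which keeps the adapted vocabulary focused on the segment being recognized.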
Keywords :
adaptive systems; broadcasting; speech recognition; adaptive vocabularies; compound words; inflections; large-vocabulary speech recognition systems; linguistic knowledge; morphology; multilingual broadcast news transcription; out-of-vocabulary words; recognition dictionary; recognition results; speech segment; two-pass recognition approach; word accuracy; Automatic speech recognition; Broadcasting; Dictionaries; Engines; Interactive systems; Laboratories; Lattices; Natural languages; Speech recognition; Vocabulary;
Conference_Title :
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7803-4428-6
DOI :
10.1109/ICASSP.1998.675417