DocumentCode
311339
Title
The Karlsruhe-Verbmobil speech recognition engine
Author
Finke, Michael ; Geutner, Petra ; Hild, Hermann ; Kemp, Thomas ; Ries, Klaus ; Westphal, Martin
Author_Institution
Interactive Syst. Labs., Karlsruhe Univ., Germany
Volume
1
fYear
1997
fDate
21-24 Apr 1997
Firstpage
83
Abstract
Verbmobil, a German research project, aims at machine translation of spontaneous speech input. The ultimate goal is the development of a portable machine translator that will allow people to negotiate in their native language. Within this project, the University of Karlsruhe has developed a speech recognition engine that has been evaluated on a yearly basis during the project and shows very promising word accuracy results on large vocabulary spontaneous speech. We introduce the Janus Speech Recognition Toolkit underlying the speech recognizer. The main new contributions to the acoustic modeling part of our 1996 evaluation system (speaker normalization, channel normalization and polyphonic clustering) are discussed and evaluated. Besides the acoustic models, we delineate the different language models used in our evaluation system: word trigram models interpolated with class based models, and a separate spelling language model. As a result of using the toolkit and integrating all these parts into the recognition engine, the word error rate on the German spontaneous scheduling task (GSST) decreased from 30% in 1995 to 13.8% in 1996.
Keywords
computational linguistics; errors; interpolation; language translation; natural language interfaces; performance evaluation; scheduling; speech recognition; vocabulary; German research project; German spontaneous scheduling task; Janus Speech Recognition Toolkit; Karlsruhe-Verbmobil speech recognition engine; University of Karlsruhe; acoustic modeling; channel normalization; class based models; language models; large vocabulary spontaneous speech; machine translation; polyphonic clustering; speaker normalization; spelling language model; spontaneous speech input; word accuracy; word error rate; word trigram models; Engines; Error analysis; Handwriting recognition; Hidden Markov models; Interactive systems; Laboratories; Natural languages; Object oriented modeling; Speech analysis; Speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location
Munich
ISSN
1520-6149
Print_ISBN
0-8186-7919-0
Type
conf
DOI
10.1109/ICASSP.1997.599552
Filename
599552