The Karlsruhe-Verbmobil speech recognition engine

Author

Finke, Michael ; Geutner, Petra ; Hild, Hermann ; Kemp, Thomas ; Ries, Klaus ; Westphal, Martin

Author_Institution

Interactive Syst. Labs., Karlsruhe Univ., Germany

Volume

1

fYear

1997

fDate

21-24 Apr 1997

Firstpage

83

Abstract

Verbmobil, a German research project, aims at machine translation of spontaneous speech input. The ultimate goal is the development of a portable machine translator that will allow people to negotiate in their native language. Within this project the University of Karlsruhe has developed a speech recognition engine that has been evaluated on a yearly basis during the project and shows very promising word accuracy results on large vocabulary spontaneous speech. We introduce the Janus Speech Recognition Toolkit underlying the speech recognizer. The main new contributions to the acoustic modeling part of our 1996 evaluation system (speaker normalization, channel normalization and polyphonic clustering) are discussed and evaluated. Besides the acoustic models, we delineate the different language models used in our evaluation system: word trigram models interpolated with class based models, and a separate spelling language model. As a result of using the toolkit and integrating all these parts into the recognition engine, the word error rate on the German spontaneous scheduling task (GSST) decreased from 30% in 1995 to 13.8% in 1996.

Keywords

computational linguistics; errors; interpolation; language translation; natural language interfaces; performance evaluation; scheduling; speech recognition; vocabulary; German research project; German spontaneous scheduling task; Janus Speech Recognition Toolkit; Karlsruhe-Verbmobil speech recognition engine; University of Karlsruhe; acoustic modeling; channel normalization; class based models; language models; large vocabulary spontaneous speech; machine translation; polyphonic clustering; speaker normalization; spelling language model; spontaneous speech input; word accuracy; word error rate; word trigram models; Engines; Error analysis; Handwriting recognition; Hidden Markov models; Interactive systems; Laboratories; Natural languages; Object oriented modeling; Speech analysis; Speech recognition

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on

Conference_Location

Munich

ISSN

1520-6149

Print_ISBN

0-8186-7919-0

Type

conf

DOI

10.1109/ICASSP.1997.599552

Filename

599552