• DocumentCode
    311339
  • Title
    The Karlsruhe-Verbmobil speech recognition engine
  • Author
    Finke, Michael ; Geutner, Petra ; Hild, Hermann ; Kemp, Thomas ; Ries, Klaus ; Westphal, Martin
  • Author_Institution
    Interactive Syst. Labs., Karlsruhe Univ., Germany
  • Volume
    1
  • fYear
    1997
  • fDate
    21-24 Apr 1997
  • Firstpage
    83
  • Abstract
    Verbmobil, a German research project, aims at machine translation of spontaneous speech input. The ultimate goal is the development of a portable machine translator that will allow people to negotiate in their native language. Within this project, the University of Karlsruhe has developed a speech recognition engine that has been evaluated on a yearly basis during the project and shows very promising word accuracy results on large vocabulary spontaneous speech. We introduce the Janus Speech Recognition Toolkit underlying the speech recognizer. The main new contributions to the acoustic modeling part of our 1996 evaluation system (speaker normalization, channel normalization and polyphonic clustering) are discussed and evaluated. Besides the acoustic models, we delineate the different language models used in our evaluation system: word trigram models interpolated with class based models, and a separate spelling language model. As a result of using the toolkit and integrating all these parts into the recognition engine, the word error rate on the German spontaneous scheduling task (GSST) decreased from 30% in 1995 to 13.8% in 1996.
  • Keywords
    computational linguistics; errors; interpolation; language translation; natural language interfaces; performance evaluation; scheduling; speech recognition; vocabulary; German research project; German spontaneous scheduling task; Janus Speech Recognition Toolkit; Karlsruhe-Verbmobil speech recognition engine; University of Karlsruhe; acoustic modeling; channel normalization; class based models; language models; large vocabulary spontaneous speech; machine translation; polyphonic clustering; speaker normalization; spelling language model; spontaneous speech input; word accuracy; word error rate; word trigram models; Engines; Error analysis; Handwriting recognition; Hidden Markov models; Interactive systems; Laboratories; Natural languages; Object oriented modeling; Speech analysis; Speech recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
  • Conference_Location
    Munich
  • ISSN
    1520-6149
  • Print_ISBN
    0-8186-7919-0
  • Type
    conf
  • DOI
    10.1109/ICASSP.1997.599552
  • Filename
    599552