• DocumentCode
    2769797
  • Title
    Advances in Arabic broadcast news transcription at RWTH
  • Author
    Rybach, David ; Hahn, Stefan ; Gollan, Christian ; Schlüter, Ralf ; Ney, Hermann
  • Author_Institution
    RWTH Aachen Univ., Aachen
  • fYear
    2007
  • fDate
    9-13 Dec. 2007
  • Firstpage
    449
  • Lastpage
    454
  • Abstract
    This paper describes the RWTH speech recognition system for Arabic. Several design aspects of the system, including cross-adaptation, multiple system design and combination, are analyzed. We summarize the semi-automatic lexicon generation for Arabic using a statistical approach to grapheme-to-phoneme conversion and pronunciation statistics. Furthermore, a novel ASR-based audio segmentation algorithm is presented. Finally, we discuss practical approaches for parallelized acoustic training and memory-efficient lattice rescoring. Systematic results are reported on recent GALE evaluation corpora.
  • Keywords
    audio signal processing; broadcasting; natural language processing; speech recognition; statistical analysis; Arabic broadcast news transcription; GALE evaluation corpora; RWTH speech recognition system; audio segmentation algorithm; global autonomous language exploitation; grapheme-to-phoneme conversion; memory efficient lattice rescoring; parallelized acoustic training; pronunciation statistics; semi automatic lexicon generation; Broadcasting; Cepstral analysis; Hidden Markov models; Humans; Lattices; Loudspeakers; Mel frequency cepstral coefficient; Natural languages; Neural networks; Speech recognition; Audio Segmentation; Cross-Adaptation; Speech Recognition; System Combination;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
  • Conference_Location
    Kyoto
  • Print_ISBN
    978-1-4244-1746-9
  • Electronic_ISBN
    978-1-4244-1746-9
  • Type
    conf
  • DOI
    10.1109/ASRU.2007.4430154
  • Filename
    4430154