• DocumentCode
    37780
  • Title
    Cross-Lingual Language Modeling for Low-Resource Speech Recognition
  • Author
    Ping Xu; Pascale Fung
  • Author_Institution
    Hong Kong Univ. of Sci. & Technol., Hong Kong, China
  • Volume
    21
  • Issue
    6
  • fYear
    2013
  • fDate
    June 2013
  • Firstpage
    1134
  • Lastpage
    1144
  • Abstract
    This paper proposes using cross-lingual language modeling with syntactic information for low-resource speech recognition. We propose phrase-level transduction and syntactic reordering for transcribing a resource-poor language and translating it into a resource-rich language, if necessary. The phrase-level transduction is capable of performing n-m cross-lingual transduction. The syntactic reordering serves to model the syntactic discrepancies between the source and target languages. Our purpose is to leverage the statistics in a resource-rich language model to improve the language model of a resource-poor language and, at the same time, to improve low-resource speech recognition performance. We implement our cross-lingual language model using weighted finite-state transducers (WFSTs) and integrate it into a WFST-based speech recognition search space to output the transcriptions of both resource-poor and resource-rich languages. This creates an integrated speech transcription and translation framework. Evaluations on Cantonese speech transcription and Cantonese to standard Chinese translation tasks show that our proposed approach improves the system performance significantly, with up to 12.5% relative character error rate (CER) reduction over baseline language model interpolation, and 6.6% relative CER reduction and 18.5% relative BLEU score improvement over the best word-level transduction approach.
  • Keywords
    linguistics; natural language processing; speech recognition; Cantonese speech transcription; Chinese translation tasks; WFST-based speech recognition search space; best word-level transduction approach; character error rate reduction; cross-lingual language modeling; integrated speech transcription; integrated speech translation; low-resource speech recognition performance; phrase-level transduction; resource-poor language transcription; resource-poor language translation; resource-rich language model; statistics; syntactic discrepancies; syntactic information; syntactic reordering; weighted finite-state transducers; Context modeling; Interpolation; Speech; Speech processing; Speech recognition; Standards; Syntactics; Cross-lingual language modeling; WFST; low-resource speech recognition; syntactic reordering
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2013.2244088
  • Filename
    6425426