• DocumentCode
    2963761
  • Title

    Automatic translation from parallel speech: Simultaneous interpretation as MT training data

  • Author

    Paulik, Matthias ; Waibel, Alex

  • Author_Institution
    Interactive Syst. Labs. (interACT), Carnegie Mellon Univ., Pittsburgh, PA, USA
  • fYear
    2009
  • fDate
    Nov. 13 2009-Dec. 17 2009
  • Firstpage
    496
  • Lastpage
    501
  • Abstract
    State-of-the-art statistical machine translation depends heavily on the availability of domain-specific bilingual parallel text. However, acquiring large amounts of bilingual parallel text is costly and, depending on the language pair, sometimes impossible. We propose an alternative to parallel text as machine translation (MT) training data: audio recordings of parallel speech (pSp) as it occurs in any scenario where interpreters are involved. Although interpretation (pSp) differs significantly from translation (parallel text), we achieve surprisingly strong translation results with our pSp-trained MT and speech translation systems. We argue that the presented approach is of special interest for developing speech translation in the context of resource-deficient languages where even monolingual resources are scarce.
  • Keywords
    language translation; linguistics; speech processing; statistical analysis; text analysis; MT training data; audio recording; automatic speech translation; domain-specific bilingual parallel text; parallel speech; resource-deficient language; statistical machine translation; Art; Audio recording; Automatic speech recognition; Books; Dictionaries; Interactive systems; Laboratories; Large-scale systems; Natural languages; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
  • Conference_Location
    Merano
  • Print_ISBN
    978-1-4244-5478-5
  • Electronic_ISBN
    978-1-4244-5479-2
  • Type

    conf

  • DOI
    10.1109/ASRU.2009.5372880
  • Filename
    5372880