• DocumentCode
    2177055
  • Title
    Acoustic data sharing for Afghan and Persian languages
  • Author
    Mandal, Arindam ; Vergyri, Dimitra ; Akbacak, Murat ; Richey, Colleen ; Kathol, Andreas
  • Author_Institution
    Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    4996
  • Lastpage
    4999
  • Abstract
    In this work, we compare several known approaches to multilingual acoustic modeling for three languages of recent geopolitical interest: Dari, Farsi, and Pashto. We demonstrate that a single multilingual acoustic model can be trained for these languages and achieves recognition accuracy close to that of monolingual (language-dependent) models. When only a small amount of training data is available for each language, the multilingual model may even outperform the monolingual ones. We also explore adapting the multilingual model to target-language data; the adapted models improve automatic speech recognition (ASR) performance over the monolingual models by 3% relative word error rate (WER) for both large and small amounts of training data.
  • Keywords
    speech recognition; ASR; Afghan languages; Persian languages; WER; acoustic data sharing; automatic speech recognition; multilingual acoustic modeling; word error rate; Acoustics; Adaptation models; Data models; Hidden Markov models; Speech; Speech recognition; Training data; language-independent acoustic modeling; languages of Afghanistan
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5947478
  • Filename
    5947478