• DocumentCode
    1689786
  • Title

    Rapid development of a Latvian speech-to-text system

  • Author

    Oparin, Ilya ; Lamel, Lori ; Gauvain, Jean-Luc

  • Author_Institution
    LNE (Nat. Metrol. & Testing Lab.), Trappes, France
  • fYear
    2013
  • Firstpage
    7309
  • Lastpage
    7313
  • Abstract
    This paper describes the development of a Latvian speech-to-text (STT) system at LIMSI within the Quaero project. One of the aims of the speech processing activities in the Quaero project is to cover all official European languages. However, for some of these languages very limited training resources, if any, are available via corpora agencies such as LDC and ELRA. The aim of this study was to show, taking Latvian as an example, how an STT system can be rapidly developed without any transcribed training data. Following the scheme proposed in this paper, the Latvian STT system was developed in about a month and obtained a word error rate of 20% on broadcast news and conversation data in the Quaero 2012 evaluation campaign.
  • Keywords
    error statistics; speech synthesis; ELRA; LDC; LIMSI; Latvian STT system; Latvian speech-to-text system; Quaero 2012 evaluation campaign; broadcast news; conversation data; corpora agencies; official European languages; speech processing activities; training resources; word error rate; Acoustics; Artificial neural networks; Data models; Hidden Markov models; Speech; Speech recognition; Training; Latvian; under-resourced language
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type

    conf

  • DOI
    10.1109/ICASSP.2013.6639082
  • Filename
    6639082