Title :
Rapid development of a Latvian speech-to-text system
Author :
Oparin, Ilya ; Lamel, Lori ; Gauvain, Jean-Luc
Author_Institution :
LNE (Nat. Metrol. & Testing Lab.), Trappes, France
Abstract :
This paper describes the development of a Latvian speech-to-text (STT) system at LIMSI within the Quaero project. One of the aims of the speech processing activities in the Quaero project is to cover all official European languages. However, for some of these languages only very limited training resources, if any, are available via corpora agencies such as LDC and ELRA. The aim of this study was to show, taking Latvian as an example, how an STT system can be rapidly developed without any transcribed training data. Following the scheme proposed in this paper, the Latvian STT system was developed in about a month and obtained a word error rate of 20% on broadcast news and conversation data in the Quaero 2012 evaluation campaign.
Keywords :
error statistics; speech synthesis; ELRA; LDC; LIMSI; Latvian STT system; Latvian speech-to-text system; Quaero 2012 evaluation campaign; broadcast news; conversation data; corpora agencies; official European languages; speech processing activities; training resources; word error rate; Acoustics; Artificial neural networks; Data models; Hidden Markov models; Speech; Speech recognition; Training; Latvian; under-resourced language
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6639082