DocumentCode
1689786
Title
Rapid development of a Latvian speech-to-text system
Author
Oparin, Ilya ; Lamel, Lori ; Gauvain, Jean-Luc
Author_Institution
LNE (Nat. Metrol. & Testing Lab.), Trappes, France
fYear
2013
Firstpage
7309
Lastpage
7313
Abstract
This paper describes the development of a Latvian speech-to-text (STT) system at LIMSI within the Quaero project. One of the aims of the speech processing activities in the Quaero project is to cover all official European languages. However, for some of these languages only very limited training resources, if any, are available via corpora agencies such as LDC and ELRA. The aim of this study was to show, taking Latvian as an example, how an STT system can be rapidly developed without any transcribed training data. Following the scheme proposed in this paper, the Latvian STT system was developed in about a month and obtained a word error rate of 20% on broadcast news and conversation data in the Quaero 2012 evaluation campaign.
Keywords
error statistics; speech synthesis; ELRA; LDC; LIMSI; Latvian STT system; Latvian speech-to-text system; Quaero 2012 evaluation campaign; broadcast news; conversation data; corpora agencies; official European languages; speech processing activities; training resources; word error rate; Acoustics; Artificial neural networks; Data models; Hidden Markov models; Speech; Speech recognition; Training; Latvian; under-resourced language
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location
Vancouver, BC
ISSN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2013.6639082
Filename
6639082