DocumentCode :
1849886
Title :
ASR for low-resourced languages: Building a phonetically balanced Romanian speech corpus
Author :
Stanescu, Miruna ; Cucu, H. ; Buzo, Andi ; Burileanu, C.
Author_Institution :
Univ. Politeh. of Bucharest, Bucharest, Romania
fYear :
2012
fDate :
27-31 Aug. 2012
Firstpage :
2060
Lastpage :
2064
Abstract :
The construction of automatic speech recognition (ASR) systems depends fundamentally on the speech corpus used to train the acoustic models. The speech corpus should be phonetically balanced to ensure that the acoustic models are properly trained. This paper presents the design and development of the first phonetically balanced Romanian speech corpus. It describes all the language processing steps taken to obtain a proper set of phrases, discusses some important aspects of Romanian phonetics, and emphasizes the phrase selection mechanism.
Keywords :
natural language processing; speech recognition; ASR systems; Romanian phonetics; acoustic models; automatic speech recognition systems; language processing steps; phonetically balanced Romanian speech corpus; phrase selection mechanism; Acoustics; Automatic speech recognition; Dictionaries; Robustness; Speech; ASR; corpora acquisition; corpora processing; diacritics restoration
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)
Conference_Location :
Bucharest
ISSN :
2219-5491
Print_ISBN :
978-1-4673-1068-0
Type :
conf
Filename :
6333974