DocumentCode
1849886
Title
ASR for low-resourced languages: Building a phonetically balanced Romanian speech corpus
Author
Stanescu, Miruna ; Cucu, H. ; Buzo, Andi ; Burileanu, C.
Author_Institution
Univ. Politeh. of Bucharest, Bucharest, Romania
fYear
2012
fDate
27-31 Aug. 2012
Firstpage
2060
Lastpage
2064
Abstract
The construction of automatic speech recognition (ASR) systems depends fundamentally on the speech corpus used to train the acoustic models. The speech corpus should be phonetically balanced to ensure that the acoustic models are properly trained. This paper presents the design and development of the first phonetically balanced Romanian speech corpus. It describes all the language processing steps taken to obtain a proper set of phrases, discusses some important aspects of Romanian phonetics, and emphasizes the phrase selection mechanism.
Keywords
natural language processing; speech recognition; ASR systems; Romanian phonetics; acoustic models; automatic speech recognition systems; language processing steps; phonetically balanced Romanian speech corpus; phrase selection mechanism; Acoustics; Automatic speech recognition; Dictionaries; Robustness; Speech; ASR; corpora acquisition; corpora processing; diacritics restoration
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Conference (EUSIPCO), 2012 Proceedings of the 20th European
Conference_Location
Bucharest
ISSN
2219-5491
Print_ISBN
978-1-4673-1068-0
Type
conf
Filename
6333974