• DocumentCode
    1849886
  • Title
    ASR for low-resourced languages: Building a phonetically balanced Romanian speech corpus
  • Author
    Stanescu, Miruna; Cucu, H.; Buzo, Andi; Burileanu, C.
  • Author_Institution
    Univ. Politeh. of Bucharest, Bucharest, Romania
  • fYear
    2012
  • fDate
    27-31 Aug. 2012
  • Firstpage
    2060
  • Lastpage
    2064
  • Abstract
    The construction of automatic speech recognition (ASR) systems is fundamentally dependent on the speech corpus used to train the acoustic models. The speech corpus should be phonetically balanced to ensure that the acoustic models are properly trained. This paper presents the design and development of the first phonetically balanced Romanian speech corpus. It describes all the language processing steps taken to obtain a proper set of phrases, discusses some important aspects of Romanian phonetics, and emphasizes the phrase selection mechanism.
  • Keywords
    natural language processing; speech recognition; ASR systems; Romanian phonetics; acoustic models; automatic speech recognition systems; language processing steps; phonetically balanced Romanian speech corpus; phrase selection mechanism; Acoustics; Automatic speech recognition; Dictionaries; Robustness; Speech; ASR; corpora acquisition; corpora processing; diacritics restoration
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference (EUSIPCO), 2012 Proceedings of the 20th European
  • Conference_Location
    Bucharest
  • ISSN
    2219-5491
  • Print_ISBN
    978-1-4673-1068-0
  • Type
    conf
  • Filename
    6333974