DocumentCode :
2790333
Title :
Pashto speech recognition with limited pronunciation lexicon
Author :
Prasad, Rohit ; Tsakalidis, Stavros ; Bulyko, Ivan ; Kao, Chia-Lin ; Natarajan, Prem
Author_Institution :
BBN Technologies, Cambridge, MA, USA
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
5086
Lastpage :
5089
Abstract :
Automatic speech recognition (ASR) for low-resource languages continues to be a difficult problem. In particular, colloquial dialects of Arabic, Farsi, and Pashto pose significant challenges in pronunciation dictionary creation. Therefore, most state-of-the-art ASR engines rely on the grapheme-as-phoneme approach for creating pronunciation dictionaries in these languages. While the grapheme approach simplifies ASR training, it performs significantly worse than a system trained with a high-quality phonetic dictionary. In this paper, we explore two techniques for bridging the performance gap between the grapheme and the phonetic approaches, without requiring manual pronunciations for all the words in the training data. The first approach is based on learning letter-to-sound rules from a small set of manual pronunciations in Pashto, and the second approach uses a hybrid phoneme/grapheme representation for recognition. Through experimental results on colloquial Pashto, we demonstrate that both techniques perform as well as a full phonetic system while requiring manual pronunciations for only a small fraction of the words in the acoustic training data.
Keywords :
learning (artificial intelligence); natural language processing; speech processing; speech recognition; speech synthesis; Pashto speech recognition; automatic speech recognition; colloquial dialect; grapheme approach; letter to sound rule; phonetic dictionary; pronunciation lexicon; resource language; Automatic speech recognition; Decision trees; Dictionaries; Engines; Manuals; Natural languages; Speech recognition; Training data; Vocabulary; Writing; HMM; Pashto; decision tree; grapheme-as-phoneme; speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495052
Filename :
5495052