DocumentCode
2790075
Title
Approaches to automatic lexicon learning with limited training examples
Author
Goel, Nagendra ; Thomas, Samuel ; Agarwal, Mohit ; Akyazi, Pinar ; Burget, Lukas ; Feng, Kai ; Ghoshal, Arnab ; Glembek, Ondrej ; Karafiát, Martin ; Povey, Daniel ; Rastrow, Ariya ; Rose, Richard C. ; Schwarz, Petr
Author_Institution
Go-Vivace Inc., McLean, VA, USA
fYear
2010
fDate
14-19 March 2010
Firstpage
5094
Lastpage
5097
Abstract
Preparing a lexicon for a speech recognition system can require significant effort in languages where the written form is not exactly phonetic. On the other hand, in languages where the written form is quite phonetic, some common words are often mispronounced. In this paper, we use a combination of lexicon learning techniques to explore whether a lexicon can be learned when only a small lexicon is available for bootstrapping. We find that for a phonetic language such as Spanish, the learned lexicon can perform better than one derived from generic rules or hand-crafted pronunciations. For a more complex language such as English, we find that learning a lexicon is still possible, but with some loss of accuracy.
Keywords
learning (artificial intelligence); natural language processing; speech recognition; automatic lexicon learning technique; bootstrapping; phonetic language; speech recognition systems; Acoustics; Automatic speech recognition; Costs; Dictionaries; Humans; Loudspeakers; Natural languages; Speech recognition; Training data; Vocabulary; LVCSR; Lexicon Learning
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location
Dallas, TX
ISSN
1520-6149
Print_ISBN
978-1-4244-4295-9
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2010.5495037
Filename
5495037