Title :
Pronunciation variation speech recognition without dictionary modification on sparse database
Author :
Kanokphara, Supphanat ; Tesprasit, Virongrong ; Thongprasirt, Rachod
Author_Institution :
Inf. R&D Div., Nat. Electron. & Comput. Technol. Center (NECTEC), Pathumthani, Thailand
Abstract :
Generally, a speech recognition system uses a fixed set of pronunciations, taken from its dictionary, for training and decoding. However, even a well-defined lexicon cannot cover all variations in human pronunciation, and a dictionary that listed every possible pronunciation would be too large to implement. Sharing Gaussian densities across phonetic models and using decision trees for pronunciation variation have proved efficient for handling pronunciation variation without dictionary modification. This paper presents alternative methods that can be used even when the database is sparse. Re-label training is modified to apply rule-based pronunciation variation in order to obtain real phonetic acoustic models. Phonemic acoustic models are then retrained from HMM states tied across the phonetic models. These new phonemic models allow an alternative search path during recognition. The system shows better performance in the experiment.
Keywords :
Gaussian distribution; decision trees; hidden Markov models; search problems; speech processing; speech recognition; Gaussian densities; acoustic models; alternative search path; decision tree; decoding; performance; phonemic models; phonetic models; pronunciation variation speech recognition; re-label training; sparse database; training; tying HMM states; Databases; Decision trees; Decoding; Dictionaries; Hidden Markov models; Loudspeakers; Research and development; Speech recognition;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
Print_ISBN :
0-7803-7663-3
DOI :
10.1109/ICASSP.2003.1198893