DocumentCode :
3342508
Title :
What kind of pronunciation variation is hard for triphones to model?
Author :
Jurafsky, Dan ; Ward, Wayne ; Zhang Banping ; Herold, Keith ; Xiuyang, Yu ; Sen, Zhang
Author_Institution :
Center for Spoken Language Res., Colorado Univ., Boulder, CO, USA
Volume :
1
fYear :
2001
fDate :
2001
Firstpage :
577
Abstract :
In order to help understand why gains in pronunciation modeling have proven so elusive, we investigated which kinds of pronunciation variation are well captured by triphone models, and which are not. We do this by examining the change in behavior of a recognizer as it receives further triphone training. We show that many of the kinds of variation which previous pronunciation models attempt to capture, including phone substitution or phone reduction, are in fact already well captured by triphones. Our analysis suggests new areas where future pronunciation models should focus, including syllable deletion.
Keywords :
Viterbi decoding; natural languages; probability; speech recognition; speech recognition equipment; human-to-human speech; phone reduction; phone substitution; pronunciation modelling; pronunciation variation; semi-continuous acoustic models; speaker-independent recognizer; speech recognition system; syllable deletion; tri-gram language models; triphone training; Context modeling; Decoding; Dictionaries; Error analysis; Error correction; Natural languages; Prototypes; Speech recognition; Training data; Viterbi algorithm
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location :
Salt Lake City, UT
ISSN :
1520-6149
Print_ISBN :
0-7803-7041-4
Type :
conf
DOI :
10.1109/ICASSP.2001.940897
Filename :
940897