DocumentCode :
3595715
Title :
Context dependent phonetic duration models for decoding conversational speech
Author :
Monkowski, Michael D. ; Picheny, Michael A. ; Rao, P. Srinivasa
Author_Institution :
Human Language Technol. Group, IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Volume :
1
fYear :
1995
Firstpage :
528
Abstract :
Conversational speech is a particularly difficult task for speech recognition: it exhibits much more variability than dictation, read speech, or isolated commands. Phonetic context was used to predict phone durations with a decision tree. These predictions were used to calculate context-dependent HMM transition probabilities for the phone models, which were then used to decode telephone conversations from the SwitchBoard corpus. We observed that the duration models do not appreciably improve the word error rate, and that more can be gained by modeling phone durations within words than by adjusting for local average speaking rates. We conclude that local or global variations in speaking rate are not major contributors to the observed high error rates for SwitchBoard.
Keywords :
decision theory; decoding; error statistics; hidden Markov models; prediction theory; probability; speech processing; speech recognition; telephony; trees (mathematics); SwitchBoard corpus; context dependent HMM transition probabilities; context dependent phonetic duration models; conversational speech decoding; decision tree; error rates; global variations; local average speaking rates; local variations; phonetic context; phonetic duration prediction; speech recognition; telephone conversations; word error rate; Context modeling; Decision trees; Decoding; Error analysis; Hidden Markov models; Probability; Shape measurement; Speech recognition; Telephony; Topology; Viterbi algorithm;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-2431-5
Type :
conf
DOI :
10.1109/ICASSP.1995.479645
Filename :
479645