DocumentCode :
294646
Title :
Duration modeling in large vocabulary speech recognition
Author :
Anastasakos, Anastasios ; Schwartz, Richard ; Shu, Han
Author_Institution :
Northeastern Univ., Boston, MA, USA
Volume :
1
fYear :
1995
fDate :
9-12 May 1995
Firstpage :
628
Abstract :
This paper presents a study of different methods for phoneme duration modeling in large vocabulary speech recognition. We investigate the use of phoneme duration and the effect of context, speaking rate, and lexical stress on the duration of phoneme segments in a large vocabulary speech recognition system. The duration models are used in a postprocessing phase of BYBLOS, our baseline HMM-based recognition system, to rescore the N-Best hypotheses. We describe experiments with the 5K-word ARPA Wall Street Journal (WSJ) corpus. The results show that integrating duration models that take into account context and speaking rate can improve the word accuracy of the baseline recognition system.
Keywords :
hidden Markov models; speech recognition; ARPA Wall Street Journal corpus; BYBLOS; N-Best hypotheses; baseline HMM-based recognition system; context; duration modeling; large vocabulary speech recognition; lexical stress; phoneme; postprocessing phase; speaking rate; word accuracy; Context modeling; Costs; Density functional theory; Employment; Hidden Markov models; Signal analysis; Speech analysis; Speech recognition; Stress; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
Conference_Location :
Detroit, MI
ISSN :
1520-6149
Print_ISBN :
0-7803-2431-5
Type :
conf
DOI :
10.1109/ICASSP.1995.479676
Filename :
479676