DocumentCode :
2016679
Title :
Problems of modeling phone deletion in conversational speech for speech recognition
Author :
Mak, Brian ; Ko, Tom
Author_Institution :
Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
fYear :
2010
fDate :
Nov. 29 2010-Dec. 3 2010
Firstpage :
114
Lastpage :
118
Abstract :
Recently, we proposed a novel method to explicitly model the phone deletion phenomenon in speech, and introduced the context-dependent fragmented word model (CD-FWM). An evaluation on the WSJ1 Hub2 5K task shows that, even in read speech, CD-FWM can reduce the word error rate (WER) by 10.3% relative. Since phone deletion is generally expected to be more pronounced in conversational and spontaneous speech than in read speech, in this paper we extend our investigation to conversational speech by applying CD-FWM to the SVitchboard 500-word task. To our surprise, a much smaller recognition gain is obtained. Through a series of analyses, we present some plausible explanations for why phone deletion modeling is more successful in read speech than in conversational speech, and suggest future directions for improving CD-FWM for recognizing conversational speech.
Keywords :
speech recognition; CD-FWM; SVitchboard; context dependent fragmented word model; conversational speech; phone deletion phenomenon; word error rate; Acoustics; Analytical models; Conferences; Context modeling; Hidden Markov models; Speech
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2010 7th International Symposium on Chinese Spoken Language Processing (ISCSLP)
Conference_Location :
Tainan
Print_ISBN :
978-1-4244-6244-5
Type :
conf
DOI :
10.1109/ISCSLP.2010.5684839
Filename :
5684839