DocumentCode
2016679
Title
Problems of modeling phone deletion in conversational speech for speech recognition
Author
Mak, Brian ; Ko, Tom
Author_Institution
Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
fYear
2010
fDate
Nov. 29 2010-Dec. 3 2010
Firstpage
114
Lastpage
118
Abstract
Recently, we proposed a novel method to explicitly model the phone deletion phenomenon in speech, and introduced the context-dependent fragmented word model (CD-FWM). An evaluation on the WSJ1 Hub2 5K task showed that, even in read speech, CD-FWM could reduce the word error rate (WER) by a relative 10.3%. Since phone deletion is generally expected to be more pronounced in conversational and spontaneous speech than in read speech, in this paper we extend our investigation to conversational speech, applying CD-FWM to the SVitchboard 500-word task. To our surprise, a much smaller recognition gain is obtained. Through a series of analyses, we present some plausible explanations for why phone deletion modeling is more successful in read speech than in conversational speech, and suggest future directions for improving CD-FWM for recognizing conversational speech.
Keywords
speech recognition; CD-FWM; SVitchboard; context dependent fragmented word model; conversational speech; phone deletion phenomenon; word error rate; Acoustics; Analytical models; Conferences; Context modeling; Hidden Markov models; Speech
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
Conference_Location
Tainan
Print_ISBN
978-1-4244-6244-5
Type
conf
DOI
10.1109/ISCSLP.2010.5684839
Filename
5684839