• DocumentCode
    2016679
  • Title
    Problems of modeling phone deletion in conversational speech for speech recognition
  • Author
    Mak, Brian; Ko, Tom
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
  • fYear
    2010
  • fDate
    Nov. 29 2010-Dec. 3 2010
  • Firstpage
    114
  • Lastpage
    118
  • Abstract
    Recently, we proposed a novel method to explicitly model the phone deletion phenomenon in speech and introduced the context-dependent fragmented word model (CD-FWM). An evaluation on the WSJ1 Hub2 5K task shows that, even in read speech, CD-FWM can reduce word error rate (WER) by a relative 10.3%. Since phone deletion is generally expected to be more pronounced in conversational and spontaneous speech than in read speech, in this paper we extend our investigation of phone deletion modeling with CD-FWM to conversational speech on the SVitchboard 500-word task. To our surprise, a much smaller recognition gain is obtained. Through a series of analyses, we present some plausible explanations for why phone deletion modeling is more successful in read speech than in conversational speech, and suggest future directions for improving CD-FWM for recognizing conversational speech.
  • Keywords
    speech recognition; CD-FWM; SVitchboard; context-dependent fragmented word model; conversational speech; phone deletion phenomenon; word error rate; Acoustics; Analytical models; Conferences; Context modeling; Hidden Markov models; Speech
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
  • Conference_Location
    Tainan
  • Print_ISBN
    978-1-4244-6244-5
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2010.5684839
  • Filename
    5684839