• DocumentCode
    1161233
  • Title

    Recursive likelihood evaluation and fast search algorithm for polynomial segment model with application to speech recognition

  • Author

    Li, Chak-Fai ; Siu, Man-Hung ; Au-Yeung, Jeff Siu-Kei

  • Author_Institution
    Electr. & Electron. Eng. Dept., Hong Kong Univ. of Sci. & Technol.
  • Volume
    14
  • Issue
    5
  • fYear
    2006
  • Firstpage
    1704
  • Lastpage
    1718
  • Abstract
    Polynomial segment models (PSMs), which are generalization of the hidden Markov models (HMMs), have opened an alternative research direction for speech recognition. However, they have been limited by their computational complexity. Traditionally, any change in PSM segment boundary requires likelihood recomputation of all the frames within the segment. This makes the PSM\´s segment likelihood evaluation an order of magnitude more expensive than the HMM\´s. Furthermore, because recognition using segment models needs to search over all possible segment boundaries, the PSM recognition is computationally unfeasible beyond N-best rescoring. By exploiting the properties of the time normalization in PSM, and by decomposing the PSM segment likelihood into a simple function of "sufficient statistics", in this paper, we show that segment likelihood can be evaluated efficiently in an order of computational complexity similar to HMM. In addition, by reformulating the PSM recognition as a search for the optimal path through a graph, this paper introduces a fast PSM search algorithm that intelligently prunes the number of hypothesized segment boundaries, such that PSM recognition can be performed in an order of complexity similar to HMM. We demonstrate the effectiveness of the proposed algorithms with experiments using a PSM-based recognition system on two different recognition tasks: TIDIGIT digit recognition and the Wall Street Journal dictation task. In both tasks, PSM recognition is feasible and out-performed traditional HMM by more than 14%
  • Keywords
    computational complexity; hidden Markov models; search problems; speech recognition; TIDIGIT digit recognition; Wall Street Journal dictation task; computational complexity; fast search algorithm; hidden Markov models; polynomial segment model; recursive likelihood evaluation; segment likelihood; speech recognition; time normalization; Acoustics; Computational complexity; Computational intelligence; Hidden Markov models; Parametric statistics; Polynomials; Speech processing; Speech recognition; Stochastic processes; Viterbi algorithm; Fast algorithm; polynomial segment model; search; speech recognition;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TSA.2005.858553
  • Filename
    1677990