DocumentCode
1161233
Title
Recursive likelihood evaluation and fast search algorithm for polynomial segment model with application to speech recognition
Author
Li, Chak-Fai ; Siu, Man-Hung ; Au-Yeung, Jeff Siu-Kei
Author_Institution
Electr. & Electron. Eng. Dept., Hong Kong Univ. of Sci. & Technol.
Volume
14
Issue
5
fYear
2006
Firstpage
1704
Lastpage
1718
Abstract
Polynomial segment models (PSMs), which are generalization of the hidden Markov models (HMMs), have opened an alternative research direction for speech recognition. However, they have been limited by their computational complexity. Traditionally, any change in PSM segment boundary requires likelihood recomputation of all the frames within the segment. This makes the PSM\´s segment likelihood evaluation an order of magnitude more expensive than the HMM\´s. Furthermore, because recognition using segment models needs to search over all possible segment boundaries, the PSM recognition is computationally unfeasible beyond N-best rescoring. By exploiting the properties of the time normalization in PSM, and by decomposing the PSM segment likelihood into a simple function of "sufficient statistics", in this paper, we show that segment likelihood can be evaluated efficiently in an order of computational complexity similar to HMM. In addition, by reformulating the PSM recognition as a search for the optimal path through a graph, this paper introduces a fast PSM search algorithm that intelligently prunes the number of hypothesized segment boundaries, such that PSM recognition can be performed in an order of complexity similar to HMM. We demonstrate the effectiveness of the proposed algorithms with experiments using a PSM-based recognition system on two different recognition tasks: TIDIGIT digit recognition and the Wall Street Journal dictation task. In both tasks, PSM recognition is feasible and out-performed traditional HMM by more than 14%
Keywords
computational complexity; hidden Markov models; search problems; speech recognition; TIDIGIT digit recognition; Wall Street Journal dictation task; computational complexity; fast search algorithm; hidden Markov models; polynomial segment model; recursive likelihood evaluation; segment likelihood; speech recognition; time normalization; Acoustics; Computational complexity; Computational intelligence; Hidden Markov models; Parametric statistics; Polynomials; Speech processing; Speech recognition; Stochastic processes; Viterbi algorithm; Fast algorithm; polynomial segment model; search; speech recognition;
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher
ieee
ISSN
1558-7916
Type
jour
DOI
10.1109/TSA.2005.858553
Filename
1677990
Link To Document