DocumentCode
1502206
Title
A detection approach to search-space reduction for HMM state alignment in speaker verification
Author
Li, Qi Peter
Author_Institution
Lucent Technol., Bell Labs., Murray Hill, NJ, USA
Volume
9
Issue
5
fYear
2001
fDate
7/1/2001 12:00:00 AM
Firstpage
569
Lastpage
578
Abstract
To support speaker verification (SV) in portable devices and in telephone servers with millions of users, a fast algorithm for hidden Markov model (HMM) alignment is necessary. Currently, the most popular algorithm is the Viterbi (1967) algorithm with beam search to reduce the search space; however, it is difficult to determine a suitable beam width beforehand. A small beam width may miss the optimal path, while a large one may slow down the alignment. To address this problem, we propose a nonheuristic approach to reducing the search space. Following the definition of the left-to-right HMM, we first detect the possible change-points between HMM states in a forward-and-backward scheme, then use the change-points to enclose a subspace for searching. The Viterbi algorithm or any other search algorithm can then be applied to the subspace to find the optimal state alignment. Compared to a full-search algorithm, the proposed algorithm is about four times faster, while its accuracy is slightly better in an SV task; compared to the beam search algorithm, the proposed algorithm can provide better accuracy with even lower complexity. In short, for an HMM with S states, the computational complexity can be reduced by up to a factor of S/3 with slightly better accuracy than in a full-search approach. This paper also discusses how to extend the change-point detection approach to large-vocabulary continuous speech recognition.
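The idea of running Viterbi only inside an enclosed subspace can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how change-points are detected, so the per-state frame windows `lo` and `hi` are hypothetical inputs standing in for the detected change-points. The function name `constrained_viterbi`, the uniform transition scores, and the `cells` counter are likewise assumptions made for illustration.

```python
import numpy as np

def constrained_viterbi(log_b, lo, hi, log_stay=np.log(0.5), log_next=np.log(0.5)):
    """Left-to-right Viterbi alignment restricted to per-state frame windows.

    log_b[t, s] : log-likelihood of frame t under state s (shape T x S).
    lo[s], hi[s]: inclusive frame range in which state s may be active
                  (hypothetical stand-in for detected change-points).
    Returns (state path, best log score, number of cells evaluated).
    """
    T, S = log_b.shape
    delta = np.full((T, S), -np.inf)      # best log score ending in (t, s)
    back = np.zeros((T, S), dtype=int)    # predecessor state for backtrace
    delta[0, 0] = log_b[0, 0]             # a left-to-right path starts in state 0
    cells = 0
    for t in range(1, T):
        for s in range(S):
            if not (lo[s] <= t <= hi[s]):
                continue                  # cell lies outside the enclosed subspace
            cells += 1
            stay = delta[t - 1, s] + log_stay
            move = delta[t - 1, s - 1] + log_next if s > 0 else -np.inf
            if move > stay:
                delta[t, s] = move + log_b[t, s]
                back[t, s] = s - 1
            else:
                delta[t, s] = stay + log_b[t, s]
                back[t, s] = s
    # Backtrace from the final state, which a left-to-right path must end in.
    path = np.empty(T, dtype=int)
    path[-1] = S - 1
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path, delta[-1, S - 1], cells
```

Setting `lo[s] = 0` and `hi[s] = T - 1` for every state recovers a full search, so the two can be compared on the same data. As long as the windows contain the optimal path, the restricted search returns the same alignment while evaluating fewer cells.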
Keywords
Viterbi detection; computational complexity; hidden Markov models; search problems; speaker recognition; HMM state alignment; Viterbi algorithm; beam search algorithm; change-point detection; computational complexity reduction; fast algorithm; forward-and-backward scheme; full-search algorithm; hidden Markov model; large-vocabulary continuous speech recognition; left-to-right HMM; nonheuristic approach; optimal state alignment; portable devices; search-space reduction; speaker verification; telephone servers; Automatic speech recognition; Computational complexity; Costs; Hidden Markov models; Maximum likelihood decoding; Routing; Speaker recognition; Speech recognition; Telephony; Viterbi algorithm
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/89.928921
Filename
928921