Title :
A detection approach to search-space reduction for HMM state alignment in speaker verification
Author_Institution :
Lucent Technol., Bell Labs., Murray Hill, NJ, USA
Date :
7/1/2001
Abstract :
To support speaker verification (SV) on portable devices and on telephone servers with millions of users, a fast algorithm for hidden Markov model (HMM) alignment is necessary. Currently, the most popular algorithm is the Viterbi (1967) algorithm with beam search to reduce the search space; however, it is difficult to determine a suitable beam width beforehand. A small beam width may miss the optimal path, while a large one may slow down the alignment. To address this problem, we propose a nonheuristic approach to reducing the search space. Following the definition of the left-to-right HMM, we first detect the possible change-points between HMM states in a forward-and-backward scheme, then use the change-points to enclose a subspace for searching. The Viterbi algorithm or any other search algorithm can then be applied to the subspace to find the optimal state alignment. Compared to a full-search algorithm, the proposed algorithm is about four times faster, with slightly better accuracy in an SV task; compared to the beam search algorithm, the proposed algorithm can provide better accuracy with even lower complexity. In short, for an HMM with S states, the computational complexity can be reduced by up to a factor of S/3 with slightly better accuracy than in a full-search approach. This paper also discusses how to extend the change-point detection approach to large-vocabulary continuous speech recognition.
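The search-space restriction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the change-points are detected, so the sketch assumes they are already available as per-state frame bounds lo[s] and hi[s] (hypothetical names) and only shows how a Viterbi pass for a left-to-right HMM might be confined to the enclosed subspace. The function name constrained_viterbi, the log-probability inputs, and the toy values are likewise assumptions made for illustration.

```python
import math

NEG_INF = -math.inf


def constrained_viterbi(log_b, lo, hi, log_self, log_next):
    """Viterbi alignment for a left-to-right HMM, restricted to a subspace.

    log_b[t][s]  : log-likelihood of frame t under state s
    lo[s], hi[s] : assumed earliest/latest frame at which state s may be
                   occupied (stand-ins for detected change-points)
    log_self[s]  : log-probability of staying in state s
    log_next[s]  : log-probability of moving from state s to state s + 1
    Returns (state path, path log-score, number of grid cells evaluated).
    """
    T, S = len(log_b), len(log_b[0])
    score = [[NEG_INF] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    visited = 0

    for t in range(T):
        for s in range(S):
            if not (lo[s] <= t <= hi[s]):
                continue  # cell lies outside the enclosed subspace
            visited += 1
            if t == 0:
                # A left-to-right model must start in its first state.
                if s == 0:
                    score[0][0] = log_b[0][0]
                continue
            stay = score[t - 1][s] + log_self[s]
            move = score[t - 1][s - 1] + log_next[s - 1] if s > 0 else NEG_INF
            best, prev = (stay, s) if stay >= move else (move, s - 1)
            if best > NEG_INF:
                score[t][s] = best + log_b[t][s]
                back[t][s] = prev

    if score[T - 1][S - 1] == NEG_INF:
        raise ValueError("bounds exclude every complete left-to-right path")

    # Trace back from the final state at the final frame.
    path = [S - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, score[T - 1][S - 1], visited


# Toy data (made up): 6 frames, 3 states, two frames favouring each state.
hi_p, lo_p = math.log(0.8), math.log(0.1)
log_b = [[hi_p if t // 2 == s else lo_p for s in range(3)] for t in range(6)]
log_self = [math.log(0.5)] * 3
log_next = [math.log(0.5)] * 2

full = constrained_viterbi(log_b, [0, 0, 0], [5, 5, 5], log_self, log_next)
banded = constrained_viterbi(log_b, [0, 1, 3], [2, 4, 5], log_self, log_next)
print(full)
print(banded)
```

With bounds that cover every frame, the function reduces to an ordinary full search. In this toy case the narrower bounds still contain the best path, so both calls return the same alignment while the restricted call evaluates 10 grid cells instead of 18. The saving on real data depends on how tightly the detected change-points enclose the true state boundaries.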
Keywords :
Viterbi detection; computational complexity; hidden Markov models; search problems; speaker recognition; HMM state alignment; Viterbi algorithm; beam search algorithm; change-point detection; computational complexity reduction; fast algorithm; forward-and-backward scheme; full-search algorithm; hidden Markov model; large-vocabulary continuous speech recognition; left-to-right HMM; nonheuristic approach; optimal state alignment; portable devices; search-space reduction; speaker verification; telephone servers; Automatic speech recognition; Computational complexity; Costs; Hidden Markov models; Maximum likelihood decoding; Routing; Speaker recognition; Speech recognition; Telephony; Viterbi algorithm;
Journal_Title :
Speech and Audio Processing, IEEE Transactions on