DocumentCode :
2993597
Title :
An efficient vector-quantization preprocessor for speaker independent isolated word recognition
Author :
Pan, K.C. ; Soong, F.K. ; Rabiner, L.R. ; Bergh, A.F.
Author_Institution :
AT&T Bell Laboratories, Murray Hill, New Jersey
Volume :
10
fYear :
1985
fDate :
Apr 1985
Firstpage :
874
Lastpage :
877
Abstract :
Recently, a new structure for isolated word recognition was proposed based on the ideas of vector quantization (VQ). In this scheme, a separate VQ codebook was designed for each word in the vocabulary, based on a training sequence of several tokens of each word by one or more talkers. In the original implementation, the recognizer chose the word in the vocabulary whose average quantization distortion (according to its particular codebook) was minimum. In the proposed implementation, the word-based VQs are used as a front-end preprocessor to eliminate word candidates whose distortion scores are large; a DTW processor then resolves the choice among the remaining word candidates (i.e., those which are passed on by the preprocessor). Both of the above schemes work very well for small vocabularies; however, their major flaw is the lack of temporal information in the word-based VQ processor. As the vocabulary for recognition grows in size and complexity, the ability of the VQ processor to resolve among similar-sounding words decreases dramatically, and the effectiveness of the proposed recognition structure similarly decreases. To alleviate this difficulty, a technique for incorporating temporal structure into the preprocessor is also proposed. In particular, the probability density function of the time of occurrence for each vector in the codebook is estimated from the same training sequence used to derive the codebook vectors. In the recognizer, the spectral distance score of the VQ is combined with a (scaled) temporal distance score for each frame in the word. An evaluation of the proposed recognizer showed good performance on both the digits vocabulary and a vocabulary of 129 airline terms.
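The preprocessing idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: squared Euclidean distance as the spectral distortion, a histogram over normalized time bins as the time-of-occurrence density, a negative log-probability as the temporal distance, a fixed `margin` as the pruning rule, and the names `word_score`, `preselect`, `alpha`, and `margin`. The DTW stage that resolves the surviving candidates is not shown.

```python
import math

def sq_dist(a, b):
    # Assumed spectral distortion: squared Euclidean distance.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def word_score(frames, codebook, time_pdfs, alpha):
    """Average per-frame score of an utterance against one word's codebook.

    time_pdfs[k][b] is the (hypothetical) estimated probability that
    codeword k occurs in normalized time bin b of the word.
    """
    n = len(frames)
    n_bins = len(time_pdfs[0])
    total = 0.0
    for t, frame in enumerate(frames):
        b = min(n_bins - 1, t * n_bins // n)  # normalized time bin of frame t
        best = math.inf
        for k, code in enumerate(codebook):
            spectral = sq_dist(frame, code)
            temporal = -math.log(max(time_pdfs[k][b], 1e-6))
            best = min(best, spectral + alpha * temporal)
        total += best
    return total / n

def preselect(frames, models, margin, alpha):
    """Keep words whose score is within `margin` of the best score."""
    scores = {w: word_score(frames, cb, pdfs, alpha)
              for w, (cb, pdfs) in models.items()}
    best = min(scores.values())
    return sorted(w for w, s in scores.items() if s <= best + margin), scores

# Toy data: two "words" share the same code vectors in opposite time order.
codebook = [[0.0], [1.0]]
models = {
    "ab": (codebook, [[0.9, 0.1], [0.1, 0.9]]),
    "ba": (codebook, [[0.1, 0.9], [0.9, 0.1]]),
}
frames = [[0.0], [0.0], [1.0], [1.0]]

spectral_only, _ = preselect(frames, models, margin=0.5, alpha=0.0)
with_temporal, _ = preselect(frames, models, margin=0.5, alpha=1.0)
```

In this toy case the spectral-only score cannot separate the two words, so both are passed on, while the scaled temporal term removes the word whose codewords occur in the wrong order.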
Keywords :
Algorithm design and analysis; Autocorrelation; Desktop publishing; Digital signal processing; Linear predictive coding; Logic; Training data; Vector quantization; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '85.
Type :
conf
DOI :
10.1109/ICASSP.1985.1168317
Filename :
1168317