Title :
The optimization of perceptually-based features for speaker identification
Author :
Xu, L. ; Oglesby, J. ; Mason, John S.
Author_Institution :
Dept. of Electr. & Electron. Eng., Univ. Coll., Swansea, UK
Abstract :
Results of an experimental study on the optimization of features for a conventional vector-quantization codebook-based automatic speaker identification (ASI) system are presented. Standard LPC (linear predictive coding) and a perceptually weighted feature termed PLP (perceptually based linear prediction) are compared using appropriate distance measures, namely the log-likelihood and three cepstral variants: constant weighting, the root-power-sum, and the inverse variance. PLP features combined with a weighted cepstral measure are found to be consistently the best in a number of different digit-independent ASI experiments. Results support the hypothesis that the higher orders of PLP (>5) contain significant speaker-specific information, with ASI performance improving rapidly up to order 8, and then far more slowly yet consistently up to order 16. A similar pattern is seen for codebook size, with fast improvements up to size 64 and more gradual gains thereafter.
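As an illustration of the kind of system the abstract describes, the following is a minimal sketch of codebook-based speaker identification with a weighted cepstral distance. It is not the authors' implementation. The function names, the array shapes, and the specific weight definitions (unit weights for constant weighting, k^2 for the root-power-sum, and 1/variance for the inverse variance) are assumptions based on common textbook forms of these measures. The log-likelihood measure and the PLP/LPC feature extraction are not covered.

```python
import numpy as np


def cepstral_weights(order, scheme, variances=None):
    """Return per-coefficient weights for a squared cepstral distance.

    Assumed (textbook) forms, not taken from the paper:
      constant          -> w_k = 1
      root_power_sum    -> w_k = k^2, k = 1..order
      inverse_variance  -> w_k = 1 / var_k
    """
    k = np.arange(1, order + 1, dtype=float)
    if scheme == "constant":
        return np.ones(order)
    if scheme == "root_power_sum":
        return k ** 2
    if scheme == "inverse_variance":
        if variances is None:
            raise ValueError("inverse_variance needs per-coefficient variances")
        return 1.0 / np.asarray(variances, dtype=float)
    raise ValueError(f"unknown weighting scheme: {scheme}")


def weighted_cepstral_distance(frames, codebook, weights):
    """Weighted squared distance between every frame and every codeword.

    frames: (T, order), codebook: (M, order) -> result: (T, M)
    """
    diff = frames[:, None, :] - codebook[None, :, :]
    return np.sum(weights * diff ** 2, axis=-1)


def average_distortion(frames, codebook, weights):
    """Mean distance from each frame to its nearest codeword."""
    return weighted_cepstral_distance(frames, codebook, weights).min(axis=1).mean()


def identify_speaker(frames, codebooks, weights):
    """Pick the speaker whose codebook gives the lowest average distortion."""
    scores = {name: average_distortion(frames, cb, weights)
              for name, cb in codebooks.items()}
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    order = 8
    # Toy codebooks (hypothetical data): 2 codewords per speaker.
    codebooks = {"A": np.zeros((2, order)), "B": np.ones((2, order))}
    test_frames = np.full((5, order), 0.1)  # closer to speaker A's codewords
    w = cepstral_weights(order, "root_power_sum")
    best, scores = identify_speaker(test_frames, codebooks, w)
    print(best, scores)
```

In this sketch the feature order and the codebook size correspond to the second dimension of `frames` and the first dimension of each codebook, the two parameters whose effect on ASI performance the abstract reports.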
Keywords :
speech recognition; cepstral variants; codebook-based; constant weighting; distance measures; inverse variance; linear predictive coding; log-likelihood; perceptually based linear prediction; perceptually-based features; root-power-sum; speaker identification; vector-quantization; Automatic speech recognition; Cepstral analysis; Current measurement; Decision making; Educational institutions; Feature extraction; Linear predictive coding; Pattern matching; Speech recognition; Testing;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1989. ICASSP-89., 1989 International Conference on
Conference_Location :
Glasgow
DOI :
10.1109/ICASSP.1989.266478