Title :
An improved speaker identification technique employing multiple representations of the linear prediction coefficients
Author :
Mikhael, W.B. ; Premakanthan, Pravinkumar
Author_Institution :
Sch. of Electr. Eng. & Comput. Sci., Central Florida Univ., Orlando, FL, USA
Abstract :
A novel Linear Prediction Coefficient (LPC) based Automatic Speaker Identification (ASI) technique employing multiple representations of the LPC is presented. The proposed ASI system has two modes: the encoding mode and the Speaker Identification (SI) mode. During the encoding mode, also known as the training mode, the LPC are extracted for each speaker as speech features. Multiple Representation Split Vector Quantization (MRSVQ) is employed to form a representative codebook for each representation, for each speaker. During the SI (running) mode, the ASI system identifies the codebooks of the speaker in the database that best match the LPC extracted from the speech signal of the unknown speaker. The synthesized all-pole vocal tract transfer function is used as a measure of the vocal tract for ASI. Employing the normalized vocal tract transfer function error measure, the proposed technique is consistently found to achieve higher ASI accuracy than vector quantization employing an existing LPC representation, at the expense of a modest increase in computational complexity. Our ASI technique can be used in a stand-alone system or as part of an ASI environment.
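The two modes described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact form of the normalized vocal tract transfer function error, so the squared-magnitude formula below is an assumption, and the split VQ and multiple-representation stages are omitted (each codebook is simply a list of LPC vectors). All function names are hypothetical.

```python
import cmath
import math


def lpc(frame, order):
    """LPC via autocorrelation + Levinson-Durbin; returns [1, a1, ..., ap].

    Assumes a non-silent frame (r[0] > 0).
    """
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k))
         for k in range(order + 1)]
    return levinson_durbin(r, order)


def levinson_durbin(r, order):
    """Solve the LPC normal equations for A(z) = 1 + sum_j a_j z^-j."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # updated prediction error
    return a


def vocal_tract_response(a, n_freqs=64):
    """Magnitude of the all-pole model 1/A(e^{jw}) on [0, pi)."""
    mags = []
    for m in range(n_freqs):
        w = math.pi * m / n_freqs
        A = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        mags.append(1.0 / abs(A))
    return mags


def normalized_error(a_test, a_code):
    """Assumed error measure: ||H_test - H_code||^2 / ||H_test||^2."""
    h_t = vocal_tract_response(a_test)
    h_c = vocal_tract_response(a_code)
    num = sum((x - y) ** 2 for x, y in zip(h_t, h_c))
    den = sum(x * x for x in h_t)
    return num / den


def identify(test_frames, codebooks):
    """SI mode: return the speaker whose codebook gives the lowest
    average nearest-codeword error over the test LPC vectors."""
    def distortion(codebook):
        return sum(min(normalized_error(a, c) for c in codebook)
                   for a in test_frames) / len(test_frames)
    return min(codebooks, key=lambda name: distortion(codebooks[name]))


# Toy usage: one-codeword codebooks for two hypothetical speakers.
codebooks = {"A": [[1.0, -0.9]], "B": [[1.0, 0.5]]}
print(identify([[1.0, -0.85]], codebooks))
```

In the toy usage, the test vector has a low-frequency spectral peak like speaker A's codeword, so its error against A is smaller than against B. A fuller system would train the codebooks from LPC vectors produced by `lpc` on framed speech, for example with a k-means style quantizer.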
Keywords :
linear predictive coding; signal representation; speaker recognition; speech coding; transfer functions; vector quantisation; ASI accuracy enhancement; LPC based automatic speaker identification; codebook identification; computational complexity; encoding mode; linear prediction coefficients; multiple representation split vector quantization; multiple representations; normalized vocal tract transfer function error measure; representative codebooks; speaker identification mode; speaker identification technique; synthesized all pole vocal tract transfer function; training mode; Computer science; Encoding; Hidden Markov models; Linear predictive coding; Resonant frequency; Signal processing; Spatial databases; Speech processing; Transfer functions; Vector quantization;
Conference_Title :
Circuits and Systems, 2003. ISCAS '03. Proceedings of the 2003 International Symposium on
Print_ISBN :
0-7803-7761-3
DOI :
10.1109/ISCAS.2003.1206041