Title :
Speaker information from subband energies of Linear Prediction residual
Author :
Pati, Debadatta ; Prasanna, R. S M
Author_Institution :
Dept. of Electron. & Commun. Eng., Indian Inst. of Technol. Guwahati, Guwahati, India
Abstract :
The objective of this work is to demonstrate that significant speaker information is present in the subband energies of the Linear Prediction (LP) residual. The LP residual mostly contains excitation source information. Subband energies extracted using a mel filterbank, followed by cepstral analysis, provide a compact representation. The resulting cepstral values are termed Residual-mel Frequency Cepstral Coefficients (R-MFCC). Speaker identification studies conducted using R-MFCC as features and a Gaussian mixture model (GMM) on a subset of 30 speakers from NIST-1999 provide 87% accuracy. MFCC extracted directly from speech also provide 87% accuracy. Further, the combination of the two provides 90% accuracy, indicating that R-MFCC capture a different aspect of speaker information.
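The feature chain described in the abstract (LP residual, then mel-filterbank subband energies, then cepstral analysis) can be sketched for a single frame as follows. This is a minimal illustration and not the authors' implementation: the function names, the 8 kHz sample rate, the LP order of 10, the 24 filters, the 13 cepstral coefficients, the Hamming window, and the use of a log followed by a DCT-II are all assumptions chosen for the example, since the abstract does not state them.

```python
import cmath
import math

def lp_residual(frame, order=10):
    """LP residual of one frame (autocorrelation method, Levinson-Durbin)."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k)) for k in range(order + 1)]
    a = [0.0] * (order + 1)          # a[1..order]: predictor coefficients
    err = r[0]
    for i in range(1, order + 1):
        if err <= 1e-12:             # silent or perfectly predicted frame
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    # Inverse filtering: e[t] = s[t] - sum_j a[j] * s[t - j]
    return [frame[t] - sum(a[j] * frame[t - j] for j in range(1, min(order, t) + 1))
            for t in range(n)]

def power_spectrum(x):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame, naive DFT."""
    n = len(x)
    w = [x[t] * (0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1))) for t in range(n)]
    return [abs(sum(w[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
            for k in range(n // 2 + 1)]

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_subband_energies(power, sample_rate, n_filters=24):
    """Energies from triangular filters spaced uniformly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = (sample_rate / 2.0) / (len(power) - 1)
    energies = []
    for m in range(1, n_filters + 1):
        lo, ce, hi = edges[m - 1], edges[m], edges[m + 1]
        total = 0.0
        for k, p in enumerate(power):
            f = k * bin_hz
            total += max(0.0, min((f - lo) / (ce - lo), (hi - f) / (hi - ce))) * p
        energies.append(total)
    return energies

def r_mfcc(frame, sample_rate=8000, order=10, n_filters=24, n_ceps=13):
    """Cepstral coefficients of the LP residual (log subband energies + DCT-II)."""
    energies = mel_subband_energies(power_spectrum(lp_residual(frame, order)),
                                    sample_rate, n_filters)
    logs = [math.log(e + 1e-10) for e in energies]
    return [sum(v * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, v in enumerate(logs))
            for c in range(n_ceps)]
```

Calling `r_mfcc(frame)` on a list of samples for one frame returns one feature vector. Applying the same filterbank and cepstral steps to the speech frame itself, skipping `lp_residual`, gives the conventional MFCC counterpart that the abstract compares against.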
Keywords :
Gaussian processes; cepstral analysis; filtering theory; speaker recognition; Gaussian mixture model; NIST; cepstral values; excitation source information; linear prediction residual; mel filterbank; residual-mel frequency cepstral coefficients; speaker identification; speaker information; subband energy; Data mining; Discrete Fourier transforms; Filter bank; Information analysis; Mel frequency cepstral coefficient; Speech; Support vector machines; Testing
Conference_Title :
Communications (NCC), 2010 National Conference on
Conference_Location :
Chennai
Print_ISBN :
978-1-4244-6383-1
DOI :
10.1109/NCC.2010.5430209