Title :
Combining Vocal Source and MFCC Features for Enhanced Speaker Recognition Performance Using GMMs
Author :
Hosseinzadeh, Danoush ; Krishnan, Sridhar
Author_Institution :
Ryerson Univ., Toronto
Abstract :
This work presents seven novel spectral features for speaker recognition. These features are the spectral centroid (SC), spectral bandwidth (SBW), spectral band energy (SBE), spectral crest factor (SCF), spectral flatness measure (SFM), Shannon entropy (SE) and Renyi entropy (RE). The proposed spectral features can quantify some of the characteristics of the vocal source, or excitation component, of speech. This is useful for speaker recognition since vocal source information is known to be complementary to the vocal tract transfer function, which is usually obtained using the Mel frequency cepstral coefficients (MFCC) or linear prediction cepstral coefficients (LPCC). To evaluate the performance of the spectral features, experiments were performed using a text-independent cohort Gaussian mixture model (GMM) speaker identification system. Based on 623 users from the TIMIT database, the spectral features achieved an identification accuracy of 99.33% when combined with the MFCC-based features and when using undistorted speech. This represents a 4.03% improvement over the baseline system trained with only MFCC and ΔMFCC features.
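The seven features named in the abstract can be illustrated with a minimal sketch. It uses common textbook definitions computed over the full band of a single Hamming-windowed frame. The abstract does not give the authors' exact formulas, sub-band layout, or Renyi order, so the function name `spectral_features`, the default `alpha=3.0`, the `eps` floor, and the use of the power spectrum are all assumptions made for illustration.

```python
import numpy as np


def spectral_features(frame, sample_rate, alpha=3.0, eps=1e-12):
    """Illustrative full-band spectral descriptors for one speech frame.

    Assumptions (not taken from the paper): Hamming window, power spectrum,
    Renyi order alpha=3, and a small eps floor to keep logarithms finite.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2 + eps
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Normalized spectrum, treated as a probability mass function over bins.
    p = power / power.sum()

    centroid = np.sum(freqs * p)                                # SC, in Hz
    bandwidth = np.sqrt(np.sum((freqs - centroid) ** 2 * p))    # SBW, in Hz
    band_energy = power.sum()                                   # SBE
    crest = power.max() / power.mean()                          # SCF
    flatness = np.exp(np.mean(np.log(power))) / power.mean()    # SFM
    shannon = -np.sum(p * np.log2(p))                           # SE, in bits
    renyi = np.log2(np.sum(p ** alpha)) / (1.0 - alpha)         # RE, in bits

    return {
        "SC": centroid,
        "SBW": bandwidth,
        "SBE": band_energy,
        "SCF": crest,
        "SFM": flatness,
        "SE": shannon,
        "RE": renyi,
    }
```

In a system like the one described, such per-frame values would be appended to the MFCC vector before GMM training. A tonal frame should give a low flatness and entropy, while a noise-like frame should give values closer to the maximum.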
Keywords :
Gaussian processes; speaker recognition; Gaussian mixture model; Mel frequency cepstral coefficients; Renyi entropy; Shannon entropy; linear prediction cepstral coefficients; speaker recognition performance enhancement; spectral band energy; spectral bandwidth; spectral centroid; spectral flatness measure; vocal source; Bandwidth; Cepstral analysis; Energy measurement; Entropy; Mel frequency cepstral coefficient; Performance evaluation; Spatial databases; Speaker recognition; Speech; Transfer functions
Conference_Title :
Multimedia Signal Processing, 2007. MMSP 2007. IEEE 9th Workshop on
Conference_Location :
Crete
Print_ISBN :
978-1-4244-1274-7
DOI :
10.1109/MMSP.2007.4412892