Title :
Robust speaker identification using an auditory-based feature
Author :
Li, Qi ; Huang, Yan
Author_Institution :
Li Creative Technol. (LcT), Inc., Florham Park, NJ, USA
Abstract :
An auditory-based feature extraction algorithm is presented. The feature is based on a recently published time-frequency transform combined with a set of modules that simulate the signal processing functions of the cochlea. The feature is applied to a speaker identification task to address the acoustic mismatch between training and testing conditions. The performance of acoustic models trained on clean speech usually drops significantly when they are tested on noisy speech. The proposed feature shows strong robustness in this mismatched condition. In our speaker identification experiments, both MFCC and the proposed feature achieve near-perfect performance under clean testing conditions, but when the SNR of the input signal drops to 6 dB, the average accuracy of the MFCC feature is only 41.2%, while the proposed feature still achieves an average accuracy of 88.3%.
Keywords :
Fourier transforms; acoustic signal processing; cepstral analysis; ear; feature extraction; hearing; speaker recognition; SNR; acoustic mismatch problem; auditory based feature extraction algorithm; mel frequency cepstral coefficient; noisy speech; robust speaker identification; signal processing function; signal-to-noise ratio; time frequency transform; Acoustic signal processing; Acoustic testing; Feature extraction; Loudspeakers; Mel frequency cepstral coefficient; Performance evaluation; Robustness; Signal processing algorithms; Speech; Time frequency analysis; Speech feature extraction; auditory-based feature; cochlea; robust speaker recognition; speaker identification;
Conference_Title :
2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495589