Title :
Multi-taper MFCC features for speaker verification using I-vectors
Author :
Alam, Md Jahangir ; Kinnunen, Tomi ; Kenny, Patrick ; Ouellet, Pierre ; O'Shaughnessy, Douglas
Author_Institution :
CRIM, Montreal, QC, Canada
Abstract :
This paper studies low-variance multi-taper mel-frequency cepstral coefficient (MFCC) features in state-of-the-art speaker verification. MFCC features are usually computed from a Hamming-windowed DFT spectrum. Windowing reduces the bias of the spectrum estimate, but its variance remains high. Recently, low-variance multi-taper MFCC features were studied in speaker verification, with promising preliminary results on the NIST 2002 SRE data using a simple GMM-UBM recognizer. In this study, our goal is to validate those findings using an up-to-date i-vector classifier on the latest NIST 2010 SRE data. Our experiments on telephone speech (det5) and microphone speech (det1, det2, det3 and det4) indicate that the multi-taper approaches perform better than the conventional Hamming window technique.
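The variance argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which taper family or weights were used, so the sine tapers, the uniform weighting, the frame length `N`, the taper count `K`, and all function names below are assumptions chosen for illustration. The sketch covers only the spectrum-estimation step (the mel filterbank and DCT that complete an MFCC are omitted). It compares how much a single Hamming-tapered estimate and a multi-taper average fluctuate on white noise, whose true spectrum is flat.

```python
import cmath
import math
import random
import statistics


def power_spectrum(frame):
    """Return |X[k]|^2 for k = 0..N/2 using a direct (slow) DFT."""
    n = len(frame)
    out = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        out.append(abs(acc) ** 2)
    return out


def hamming_taper(n):
    """Hamming window scaled to unit energy."""
    w = [0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1)) for t in range(n)]
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]


def sine_tapers(n, count):
    """Orthonormal sine tapers (an assumed choice of taper family)."""
    scale = math.sqrt(2.0 / (n + 1))
    return [[scale * math.sin(math.pi * j * (t + 1) / (n + 1))
             for t in range(n)]
            for j in range(1, count + 1)]


def tapered_estimate(frame, tapers):
    """Average the power spectra from each taper (uniform weights assumed)."""
    spectra = [power_spectrum([x * w for x, w in zip(frame, taper)])
               for taper in tapers]
    return [sum(col) / len(tapers) for col in zip(*spectra)]


def pooled_variance(frames, tapers):
    """Variance of the spectrum estimate pooled over all frames and bins."""
    values = [p for frame in frames for p in tapered_estimate(frame, tapers)]
    return statistics.pvariance(values)


# White Gaussian noise has a flat true spectrum, so any spread in the
# estimates reflects estimator variance rather than signal structure.
random.seed(0)
N, K = 64, 6  # illustrative frame length and number of tapers
frames = [[random.gauss(0.0, 1.0) for _ in range(N)] for _ in range(20)]

var_hamming = pooled_variance(frames, [hamming_taper(N)])
var_multitaper = pooled_variance(frames, sine_tapers(N, K))
print(f"Hamming: {var_hamming:.3f}  multi-taper: {var_multitaper:.3f}")
```

Under these assumptions, averaging over K roughly uncorrelated tapered spectra should reduce the variance of the estimate by a factor on the order of K, at the cost of some frequency resolution.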
Keywords :
discrete Fourier transforms; microphones; speaker recognition; GMM-UBM recognizer; Hamming-windowed DFT spectrum; NIST 2002 SRE data; conventional Hamming window technique; microphone speech; multitaper MFCC; multitaper mel-frequency cepstral coefficient; speaker verification; telephone speech; up-to-date i-vector classifier; Data mining; Feature extraction; Mel frequency cepstral coefficient; NIST; Speaker recognition; Speech; Training;
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Conference_Location :
Waikoloa, HI
Print_ISBN :
978-1-4673-0365-1
Electronic_ISBN :
978-1-4673-0366-8
DOI :
10.1109/ASRU.2011.6163886