DocumentCode :
3422636
Title :
GMM/SVM N-best speaker identification under mismatch channel conditions
Author :
Zeljkovic, Ilija ; Haffner, Patrick ; Amento, Brian ; Wilpon, Jay
Author_Institution :
AT&T Labs.-Res., Florham Park, NJ
fYear :
2008
fDate :
March 31 2008-April 4 2008
Firstpage :
4129
Lastpage :
4132
Abstract :
Under severe channel mismatch conditions, such as training with far-field speech and testing with telephone data, performance of speaker identification (SID) degrades significantly, often below practical use. But for many SID tasks, it is sufficient to recognize an N-best list of speakers for further human analysis. We investigate N-best SID accuracy for matched (telephone/telephone) and mismatched (far-field/telephone) train/test channel conditions. Using an SVM-GMM supervector (GSV), pitch and formant frequency histograms (PFH) and cross-channel adaptation using cohorts, we reduced matched channel error rate by over 25% relative to the baseline (GMM-UBM), for top-1, and achieved mismatched N-best accuracy comparable to the baseline.
Keywords :
Gaussian processes; speaker recognition; support vector machines; cross-channel adaptation; formant frequency histograms; mismatch channel conditions; speaker identification; telephone data; Degradation; Histograms; Humans; Internet telephony; Microphones; Robustness; Speech analysis; Strontium; Testing; GMM; SVM; cohort speaker adaptation; formants
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
ISSN :
1520-6149
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2008.4518563
Filename :
4518563