DocumentCode :
2701359
Title :
Model Complexity Selection and Cross-Validation EM Training for Robust Speaker Diarization
Author :
Anguera, Xavier ; Shinozaki, Tetsuo ; Wooters, C. ; Hernando, Juan
Author_Institution :
Int. Comput. Sci. Inst., Berkeley, CA, USA
Volume :
4
fYear :
2007
fDate :
15-20 April 2007
Abstract :
Accurate modeling of speaker clusters is important in the task of speaker diarization. Creating accurate models involves both selecting the model complexity and training optimally given the data. Models with fixed complexity that are trained with the standard EM algorithm risk overfitting, which can reduce diarization performance. In this paper, a technique proposed by the author to estimate the complexity of a model is combined with a novel training algorithm called "cross-validation EM" to control the number of training iterations. This combination leads to more robust speaker modeling and improves speaker diarization performance. Tests on the NIST RT (MDM) datasets for meetings show a relative improvement of 10.6% on the test set.
Keywords :
computational complexity; expectation-maximisation algorithm; pattern clustering; speaker recognition; cross-validation EM; cross-validation EM training; model complexity selection; robust speaker diarization; training iterations; Audio recording; Bayesian methods; Clustering algorithms; Computer science; Contracts; Iterative algorithms; Loudspeakers; NIST; Robustness; Testing; Speaker Diarization; complexity selection; speaker segmentation and clustering
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
ISSN :
1520-6149
Print_ISBN :
1-4244-0727-3
Type :
conf
DOI :
10.1109/ICASSP.2007.366902
Filename :
4218090