DocumentCode :
2702365
Title :
Speaker Diarization: Towards a More Robust and Portable System
Author :
El Khoury, E. ; Senac, C. ; Andre-Obrecht, Regine
Author_Institution :
SAMoVA Team, CNRS UMR, Toulouse, France
Volume :
4
fYear :
2007
fDate :
15-20 April 2007
Abstract :
In this paper, we describe a new method for speaker segmentation and clustering of an audio document. For the segmentation phase, we combine the generalized likelihood ratio (GLR) and the Bayesian information criterion (BIC) in a way that avoids most of the parameter tuning. For the clustering phase, we use an existing approach based on the eigen vector space model (EVSM) with bottom-up hierarchical grouping, and we improve it by introducing prosodic information. Evaluation is carried out on the audio database of the ESTER evaluation campaign for the rich transcription of French broadcast news. Results show that our method, which operates without any a priori knowledge about the speakers, is suitable for speaker diarization: it outperforms the traditional methods with an overall diarization error rate (DER) of 16.72%.
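The abstract does not spell out how GLR and BIC are combined, so the following is only a minimal sketch of the textbook form of the two criteria, not the authors' procedure. It assumes each segment is modeled by a single Gaussian with a diagonal covariance (a simplification chosen here to keep the example dependency-free), that frames are plain lists of floats, and that every feature dimension varies within each segment. The function names and the weight `lam` are hypothetical.

```python
import math
import random


def log_det_diag(frames):
    """Log-determinant of the ML diagonal covariance of a list of feature frames."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for k in range(dim):
        mean = sum(f[k] for f in frames) / n
        var = sum((f[k] - mean) ** 2 for f in frames) / n  # ML (biased) variance
        total += math.log(var)  # assumes var > 0
    return total


def glr(x, y):
    """Log generalized likelihood ratio: two Gaussians (x, y) versus one (x + y)."""
    z = x + y
    return 0.5 * (
        len(z) * log_det_diag(z)
        - len(x) * log_det_diag(x)
        - len(y) * log_det_diag(y)
    )


def delta_bic(x, y, lam=1.0):
    """GLR minus the BIC penalty for the extra model; > 0 suggests a speaker change."""
    n = len(x) + len(y)
    dim = len(x[0])
    extra_params = 2 * dim  # one extra mean and one extra variance per dimension
    penalty = 0.5 * extra_params * math.log(n)
    return glr(x, y) - lam * penalty


def synthetic_segment(rng, n_frames, dim, mean):
    """Hypothetical stand-in for acoustic features: i.i.d. Gaussian frames."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n_frames)]


if __name__ == "__main__":
    rng = random.Random(0)
    a = synthetic_segment(rng, 200, 3, 0.0)
    b = synthetic_segment(rng, 200, 3, 0.0)  # same "speaker" as a
    c = synthetic_segment(rng, 200, 3, 5.0)  # different "speaker"
    print("same source:     ", delta_bic(a, b))
    print("different source:", delta_bic(a, c))
```

In this form the only free parameter is the penalty weight `lam`, which is the kind of tuning the abstract says the proposed combination largely avoids.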
Keywords :
Bayes methods; eigenvalues and eigenfunctions; speech processing; Bayesian information criterion; ESTER evaluation campaign; French Broadcast news; audio document; diarization error rate; eigen vector space model; generalized likelihood ratio; speaker clustering; speaker diarization; speaker segmentation; Audio databases; Bayesian methods; Broadcasting; Density estimation robust algorithm; Error analysis; Indexing; Phase detection; Robustness; Speech enhancement; Speech processing; Bayesian Information Criterion; Eigen Vector Space Model; F0 Feature; Generalized Likelihood Ratio; Speaker Diarization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
ISSN :
1520-6149
Print_ISBN :
1-4244-0727-3
Type :
conf
DOI :
10.1109/ICASSP.2007.366956
Filename :
4218144