DocumentCode :
575627
Title :
Speaker diarization of broadcast audio using automatic transcription, iVectors and cosine distance scoring
Author :
Prazak, Jan ; Bohac, Marek
Author_Institution :
Inst. of Inf. Technol. & Electron., Tech. Univ. of Liberec, Liberec, Czech Republic
fYear :
2012
fDate :
12-14 Sept. 2012
Firstpage :
211
Lastpage :
214
Abstract :
In this paper we present our system for speaker diarization of broadcast audio. The system segments the processed spoken document using an automatic transcription, and the speech segments determined by the speaker change point detector are represented by iVectors. The similarity of speech segments is evaluated using cosine distance scoring, and linear discriminant analysis is applied to cope with intra-speaker variability. We demonstrate an improvement in performance over a baseline system that employs methods based on the Bayesian Information Criterion (BIC). The presented speaker diarization system achieved a 39.2% relative improvement in diarization error rate over the baseline.
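The cosine distance scoring mentioned in the abstract can be illustrated with a minimal sketch. It assumes each speech segment is already represented by a fixed-length vector (standing in for an iVector) and that an optional projection matrix (standing in for a trained linear discriminant analysis transform) is supplied by the caller. The function names `project` and `cosine_score` are hypothetical and are not taken from the paper; iVector extraction and LDA training are not shown.

```python
import math

def project(matrix, vec):
    """Multiply a matrix (list of rows) by a vector; hypothetical helper."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def cosine_score(w1, w2, projection=None):
    """Cosine similarity between two segment vectors.

    If `projection` is given (assumed here to be an LDA-style transform),
    both vectors are projected before scoring.
    """
    if projection is not None:
        w1 = project(projection, w1)
        w2 = project(projection, w2)
    dot = sum(a * b for a, b in zip(w1, w2))
    norm = math.sqrt(sum(a * a for a in w1)) * math.sqrt(sum(b * b for b in w2))
    if norm == 0.0:
        raise ValueError("cosine score is undefined for a zero vector")
    return dot / norm
```

A score near 1 suggests that two segments point in the same direction in the vector space, which a clustering stage could treat as evidence for the same speaker. The decision threshold and the clustering procedure are left open here because the abstract does not specify them.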
Keywords :
speaker recognition; BIC; Bayesian information criterion; automatic transcription; broadcast audio; cosine distance scoring; diarization error rate; iVectors; intra-speaker variability; linear discriminant analysis; processed spoken document segmentation; speaker diarization system; speech segments; Databases; Detectors; Measurement; Speech; Speech processing; Speech recognition; Vectors; Broadcast audio; Segmentation; Speaker clustering; Speaker diarization; iVectors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
ELMAR, 2012 Proceedings
Conference_Location :
Zadar
ISSN :
1334-2630
Print_ISBN :
978-1-4673-1243-1
Type :
conf
Filename :
6338508