DocumentCode :
3528192
Title :
Effective metric-based speaker segmentation in the frequency domain
Author :
Boehm, Christoph ; Pernkopf, Franz
Author_Institution :
Signal Processing and Speech Communication Laboratory, Graz University of Technology, Graz
fYear :
2009
fDate :
19-24 April 2009
Firstpage :
4081
Lastpage :
4084
Abstract :
In this paper, we present an approach, called FREQDIST, for speaker segmentation based on a distance measure applied in the frequency domain. To enhance the detection performance, the spectrum is reweighted using normalization techniques. Additionally, noise-like (i.e., flat) spectra are removed based on their entropy. Experiments using the TIMIT database [1] and Westdeutscher Rundfunk broadcast data show that our segmentation approach performs well compared with the DISTBIC algorithm [2]. In particular, for the TIMIT data our algorithm reaches a false alarm rate (FAR) less than half that of the DISTBIC algorithm and a missed detection rate (MDR) of 7.0% instead of 13.1%.
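The abstract mentions removing noise-like (flat) spectra based on entropy but does not give the definition or threshold used. The following is only a minimal sketch of one common way to measure spectral flatness via normalized Shannon entropy. It is not the FREQDIST implementation; the function names, the naive DFT, the treatment of silent frames, and the threshold value of 0.9 are all assumptions made for illustration.

```python
import cmath
import math


def power_spectrum(frame):
    """One-sided power spectrum of a real-valued frame via a naive DFT."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))) ** 2
        for k in range(n // 2 + 1)
    ]


def spectral_entropy(frame):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1].

    Values near 1 indicate a flat (noise-like) spectrum; values near 0
    indicate energy concentrated in a few frequency bins.
    """
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        # Assumption: a silent frame carries no spectral structure,
        # so it is treated as maximally flat.
        return 1.0
    probs = [p / total for p in power]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(probs))


def is_noise_like(frame, threshold=0.9):
    """Flag a frame as noise-like. The threshold is a hypothetical value."""
    return spectral_entropy(frame) >= threshold


# A unit impulse has a perfectly flat spectrum; a pure tone does not.
impulse = [1.0] + [0.0] * 63
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
```

In this sketch, `is_noise_like(impulse)` flags the frame for removal, while `is_noise_like(tone)` keeps it. A practical system would use an FFT routine and tune the threshold on held-out data.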
Keywords :
frequency-domain analysis; speaker recognition; false alarm rate; frequency domain; metric-based speaker segmentation; normalization techniques; Broadcasting; Distance measurement; Feature extraction; Frequency domain analysis; Loudspeakers; Mel frequency cepstral coefficient; Resonant frequency; Signal processing; Signal processing algorithms; Speech; DISTBIC; FREQDIST; Speaker turn detection;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
IEEE International Conference on Acoustics, Speech and Signal Processing, 2009 (ICASSP 2009)
Conference_Location :
Taipei
ISSN :
1520-6149
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2009.4960525
Filename :
4960525