Title :
Robust speaker clustering strategies to data source variation for improved speaker diarization
Author :
Han, Kyu J. ; Kim, Samuel ; Narayanan, Shrikanth S.
Author_Institution :
Univ. of Southern California, Los Angeles
Abstract :
Agglomerative hierarchical clustering (AHC) has been widely used in speaker diarization systems to group the speech segments of a given data source by speaker identity, but it is known not to be robust to data source variation. In this paper, we identify short speech segments as one of the key potential sources of this variability that negatively affect clustering error rate (CER), and we propose three solutions to address the issue. In experiments on various meeting conversation excerpts, the proposed methods outperform simple AHC, with relative CER improvements of 17-32%.
Keywords :
error statistics; signal classification; speech processing; statistical analysis; agglomerative hierarchical clustering (AHC); clustering error rate (CER); data source variation; speaker clustering; speaker diarization; speech classification; Change detection algorithms; Clustering algorithms; Error analysis; Feature extraction; Frequency; Laboratories; NIST; Robustness; Speech analysis; Statistical distributions
Conference_Title :
Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4244-1746-9
Electronic_ISBN :
978-1-4244-1746-9
DOI :
10.1109/ASRU.2007.4430121