Title :
Speaker change point detection using deep neural nets
Author_Institution :
Centre de Rech. Inf. de Montreal (CRIM), Montréal, QC, Canada
Abstract :
We investigate the use of deep neural nets (DNNs) to provide initial speaker change points in a speaker diarization system. The DNN is trained to predict states that correspond to the location of the speaker change point (SCP) within the speech segment given as input to the DNN. We model these different speaker change point locations in the DNN input with 10 to 20 states. The confidence in an SCP is measured by the number of frame-synchronous states that correspond to the hypothesized speaker change point, and we keep only the speaker change points with the highest confidence. We show that this DNN-based change point detector reduces the number of missed change points for both an English test set and a French dev set. We also show that the DNN-based change points reduce the diarization error rate for both an English and a French diarization system. These results show the feasibility of using DNNs to provide initial speaker change points.
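The confidence measure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not state: the DNN is applied to a window that slides one frame at a time, each predicted state maps to a fixed offset inside the window, and a separate "no change" label exists. The names `vote_change_points`, `NO_CHANGE`, `frames_per_state`, and `min_votes` are hypothetical.

```python
from collections import Counter

NO_CHANGE = -1  # hypothetical label: no change point inside this window


def vote_change_points(window_states, frames_per_state=1, min_votes=3):
    """Count how many sliding windows agree on the same change-point frame.

    window_states[s] is the state predicted for the window that starts at
    frame s (assumed: one window per frame). A state k is assumed to place
    the change point k * frames_per_state frames after the window start.
    Returns (frame, votes) pairs with at least min_votes votes, sorted by frame.
    """
    votes = Counter()
    for start, state in enumerate(window_states):
        if state == NO_CHANGE:
            continue
        # Map the in-window state to an absolute frame index.
        votes[start + state * frames_per_state] += 1
    # Keep only hypotheses supported by enough frame-synchronous states.
    return sorted((frame, n) for frame, n in votes.items() if n >= min_votes)


if __name__ == "__main__":
    # Windows starting at frames 1 to 4 all point at absolute frame 6;
    # the window starting at frame 6 gives an isolated vote for frame 13.
    states = [NO_CHANGE, 5, 4, 3, 2, NO_CHANGE, 7, NO_CHANGE]
    print(vote_change_points(states))
```

Under these assumptions, a true change point is seen at a decreasing state index as the window slides past it, so consistent predictions accumulate votes on one frame while isolated predictions are discarded by the threshold.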
Keywords :
natural language processing; neural nets; speaker recognition; DNN; deep neural nets; diarization error rate; English diarization system; English test set; frame synchronous states; French dev set; French diarization system; hypothesized speaker change point; missed change points; speaker change point detection; speaker diarization system; speech segment input; Density estimation robust algorithm; Detectors; Error analysis; Measurement; Speech; Training; Training data; Deep Neural Networks; change point detection; speaker diarization
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
DOI :
10.1109/ICASSP.2015.7178806