DocumentCode
730687
Title
Speaker change point detection using deep neural nets
Author
Gupta, Vishwa
Author_Institution
Centre de Recherche Informatique de Montréal (CRIM), Montréal, QC, Canada
fYear
2015
fDate
19-24 April 2015
Firstpage
4420
Lastpage
4424
Abstract
We investigate the use of deep neural nets (DNNs) to provide initial speaker change points in a speaker diarization system. The DNN is trained to predict states that correspond to the location of the speaker change point (SCP) within the speech segment presented at the DNN input. We model these different SCP locations with 10 to 20 states. The confidence in an SCP is measured by the number of frame-synchronous states that correspond to the hypothesized speaker change point, and we keep only the speaker change points with the highest confidence. We show that this DNN-based change point detector reduces the number of missed change points for both an English test set and a French dev set. We also show that the DNN-based change points reduce the diarization error rate for both an English and a French diarization system. These results show that it is feasible to use DNNs to provide initial speaker change points.
Keywords
natural language processing; neural nets; speaker recognition; DNN; deep neural nets; diarization error rate; English diarization system; English test set; frame synchronous states; French dev set; French diarization system; hypothesized speaker change point; missed change points; speaker change point detection; speaker diarization system; speech segment input; Density estimation robust algorithm; Detectors; Error analysis; Measurement; Speech; Training; Training data; Deep Neural Networks; change point detection; speaker diarization
fLanguage
English
Publisher
IEEE
Conference_Title
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7178806
Filename
7178806