DocumentCode
730758
Title
Combining SGMM speaker vectors and KL-HMM approach for speaker diarization
Author
Madikeri, Srikanth ; Motlicek, Petr ; Bourlard, Herve
Author_Institution
Idiap Res. Inst., Martigny, Switzerland
fYear
2015
fDate
19-24 April 2015
Firstpage
4834
Lastpage
4838
Abstract
This paper introduces a method for using SGMM speaker vectors in speaker diarization, built on the architecture of the Information Bottleneck (IB) based speaker diarization system. The audio is split into short uniform segments. Speaker vectors are obtained from a Subspace Gaussian Mixture Model (SGMM) system trained on meeting data. The speaker vectors are clustered using the K-means algorithm. Two distance measures are explored in the clustering step: the cosine distance of the speaker vectors and the cosine distance of the vectors in a space projected by Probabilistic Linear Discriminant Analysis (PLDA). The clustering output initializes the Kullback-Leibler Hidden Markov Model (KL-HMM) based speech segmentation approach commonly used in the IB system for diarization. The proposed method is compared with clustering the segments using the IB based approach. On the NIST RT 09 dataset, the proposed approach using SGMM speaker vectors with PLDA yields a relative improvement of approximately 14% in diarization performance.
Keywords
Gaussian processes; hidden Markov models; mixture models; speaker recognition; K-means algorithm; KL-HMM approach; Kullback Leibler-hidden Markov model; NIST RT 09 dataset; PLDA; SGMM speaker vectors; cosine distance; distance measures; information bottleneck; probabilistic linear discriminant analysis; short uniform segments; speaker diarization; speech segmentation; subspace Gaussian mixture model; Clustering algorithms; Computational modeling; Computer architecture; Hidden Markov models; NIST; Speech; Speech processing; K-means; SGMM; speaker diarization; speaker vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7178889
Filename
7178889