Combining SGMM speaker vectors and KL-HMM approach for speaker diarization

Author

Madikeri, Srikanth ; Motlicek, Petr ; Bourlard, Herve

Author_Institution

Idiap Res. Inst., Martigny, Switzerland

fYear

2015

fDate

19-24 April 2015

Firstpage

4834

Lastpage

4838

Abstract

This paper introduces a method for using SGMM speaker vectors in speaker diarization, built on the architecture of the Information Bottleneck (IB) based speaker diarization system. The audio to be diarized is split into short uniform segments. Speaker vectors are obtained from a Subspace Gaussian Mixture Model (SGMM) system trained on meeting data and are clustered using the K-means algorithm. Two distance measures are explored in the clustering step: the cosine distance between the speaker vectors, and the cosine distance between the same vectors after projection into a Probabilistic Linear Discriminant Analysis (PLDA) space. The clustering output initializes the Kullback-Leibler Hidden Markov Model (KL-HMM) based speech segmentation approach commonly used in the IB system for diarization. The proposed method is compared with clustering the segments using the IB based approach. On the NIST RT 09 dataset, the proposed approach using SGMM speaker vectors with PLDA yields a relative improvement of approximately 14% in diarization performance.
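
The clustering step described in the abstract (K-means over speaker vectors with cosine distance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the random initialization, the fixed iteration count, and the toy 3-dimensional vectors are all assumptions made for illustration, and the SGMM vector extraction, PLDA projection, and KL-HMM realignment are not shown.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm > 0 else list(v)

def cosine_distance(u, v):
    """Cosine distance between two unit-length vectors."""
    return 1.0 - sum(a * b for a, b in zip(u, v))

def cosine_kmeans(vectors, k, n_iter=20, seed=0):
    """K-means under cosine distance; returns one cluster label per vector."""
    rng = random.Random(seed)
    x = [normalize(v) for v in vectors]
    # Assumed initialization: k distinct input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(x, k)]
    labels = [0] * len(x)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by cosine distance.
        labels = [
            min(range(k), key=lambda j: cosine_distance(v, centroids[j]))
            for v in x
        ]
        # Update step: re-normalized sum of each cluster's members.
        for j in range(k):
            members = [v for v, lab in zip(x, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = normalize([sum(col) for col in zip(*members)])
    return labels

# Toy usage: two groups of hypothetical "speaker vectors" pointing in
# different directions, standing in for per-segment SGMM speaker vectors.
rng = random.Random(1)
group_a = [[5 + rng.gauss(0, 0.1), rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(10)]
group_b = [[rng.gauss(0, 0.1), 5 + rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(10)]
labels = cosine_kmeans(group_a + group_b, k=2)
```

In a pipeline like the one the abstract outlines, the resulting labels would serve only as an initial segment-to-speaker assignment that a later segmentation stage refines.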

Keywords

Gaussian processes; hidden Markov models; mixture models; speaker recognition; K-means algorithm; KL-HMM approach; Kullback Leibler-hidden Markov model; NIST RT 09 dataset; PLDA; SGMM speaker vectors; cosine distance; distance measures; information bottleneck; probabilistic linear discriminant analysis; short uniform segments; speaker diarization; speech segmentation; subspace Gaussian mixture model; Clustering algorithms; Computational modeling; Computer architecture; NIST; Speech; Speech processing; K-means; SGMM; speaker vectors

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on

Conference_Location

South Brisbane, QLD

Type

conf

DOI

10.1109/ICASSP.2015.7178889

Filename

7178889