Large-scale speaker search using PLDA on mismatched conditions

Author

Ma, Jeff ; Silovsky, Jan ; Siu, Man-hung ; Kimball, Owen

Author_Institution

Raytheon BBN Technologies, Cambridge, MA, USA

fYear

2015

fDate

19-24 April 2015

Firstpage

1846

Lastpage

1850

Abstract

Recent work on fast speaker search over large speech corpora has focused on locality sensitive hashing (LSH) search, with hashing functions that approximate i-vector based cosine distances (CosDist) for model comparisons. Because probabilistic linear discriminant analysis (PLDA) models have shown superior performance on speaker identification (SID) in recent years, this paper focuses on using PLDA for fast speaker search. PLDA is hard to approximate well with simple hashing functions, which makes it difficult to combine with LSH search. As an alternative, we adopt a clustering-based pruning strategy to speed up PLDA search. Our results show that this strategy can significantly speed up search with minimal performance loss. A second focus of this work is adapting the PLDA model to the mismatched conditions under which the fast search runs. The adaptation technique is based on the LDA adaptation method reported in [1] and primarily adapts the LDA transform. Our results show that this adaptation improves PLDA performance significantly (over 25% relative) on data collected in different conditions. Our speed-up experiments with the adapted LDA show that the gains from the adapted PLDA are retained after the speed-up.
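The abstract does not say which clustering algorithm is used or how clusters are scored, so the following is only a minimal sketch of the general idea behind clustering-based pruning, not the authors' method. It assumes length-normalized i-vectors, a plain spherical k-means, and cosine similarity as a stand-in for PLDA scoring. The names `normalize`, `kmeans`, and `pruned_search` and the parameters `n_clusters` and `top` are hypothetical.

```python
import numpy as np

def normalize(x):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def kmeans(vectors, k, iters=20, seed=0):
    """Plain spherical k-means (an assumption; the abstract names no algorithm)."""
    rng = np.random.default_rng(seed)
    # Fancy indexing returns a copy, so updating centroids leaves vectors intact.
    centroids = vectors[rng.choice(len(vectors), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmax(vectors @ centroids.T, axis=1)
        for c in range(k):
            members = vectors[labels == c]
            if len(members):  # keep the old centroid if a cluster empties out
                centroids[c] = normalize(members.mean(axis=0))
    # Final assignment against the final centroids.
    labels = np.argmax(vectors @ centroids.T, axis=1)
    return centroids, labels

def pruned_search(query, vectors, centroids, labels, n_clusters=2, top=5):
    """Score the query against centroids, then only against members of the best clusters."""
    best = np.argsort(-(centroids @ query))[:n_clusters]
    candidates = np.flatnonzero(np.isin(labels, best))
    # Cosine similarity stands in for PLDA scoring here (assumption).
    scores = vectors[candidates] @ query
    order = np.argsort(-scores)[:top]
    return candidates[order], scores[order]

# Synthetic stand-in data: 2000 "enrolled" vectors of dimension 50.
rng = np.random.default_rng(0)
enrolled = normalize(rng.standard_normal((2000, 50)))
centroids, labels = kmeans(enrolled, k=20)
hits, hit_scores = pruned_search(enrolled[123], enrolled, centroids, labels)
```

Only the members of the `n_clusters` best-matching clusters are scored in full, which is where the speed-up comes from. A larger `n_clusters` scores more candidates and lowers the chance that the true match is pruned away.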

Keywords

speaker recognition; statistical analysis; LDA adaptation method; LDA transform; LSH search; PLDA search; SID; clustering-based pruning strategy; hashing functions; i-vector based cosine distances; large speech data corpora; large-scale fast speaker search; locality sensitive hashing search; probabilistic linear discriminant analysis model; speaker identification; Ports (Computers); Speech; Switches; Three-dimensional displays; I-vectors; PLDA; cosine distance; speaker search;

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on

Conference_Location

South Brisbane, QLD

Type

conf

DOI

10.1109/ICASSP.2015.7178290

Filename

7178290