DocumentCode
730302
Title
Large-scale speaker search using PLDA on mismatched conditions
Author
Ma, Jeff ; Silovsky, Jan ; Siu, Man-hung ; Kimball, Owen
Author_Institution
Raytheon BBN Technol., Cambridge, MA, USA
fYear
2015
fDate
19-24 April 2015
Firstpage
1846
Lastpage
1850
Abstract
Recent work on fast speaker search over large speech corpora has focused on locality sensitive hashing (LSH) search, with hash functions that approximate i-vector based cosine distance (CosDist) for model comparison. Because probabilistic linear discriminant analysis (PLDA) models have shown superior performance on speaker identification (SID) in recent years, this paper focuses on using PLDA for fast speaker search. PLDA is difficult to approximate well with simple hash functions, which makes it hard to combine with LSH search. As an alternative, we adopt a clustering-based pruning strategy to speed up PLDA search. Our results show that this strategy can significantly speed up search with minimal performance loss. A second focus of this work is adapting the PLDA model to the mismatched conditions under which the fast search runs. The adaptation technique is based on the LDA adaptation method reported in [1] and primarily adapts the LDA transform. Our results show that this adaptation improves PLDA performance significantly (over 25% relative) on data collected in different conditions. Our speed-up experiments with the adapted LDA show that the gains from the adapted PLDA are retained after the speed-up.
Keywords
speaker recognition; statistical analysis; LDA adaptation method; LDA transform; LSH search; PLDA search; SID; clustering-based pruning strategy; hashing functions; i-vector based cosine distances; large speech data corpora; large-scale fast speaker search; locality sensitive hashing search; probabilistic linear discriminant analysis model; speaker identification; Ports (Computers); Speech; Switches; Three-dimensional displays; I-vectors; PLDA; cosine distance; speaker search;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7178290
Filename
7178290