Title :
Calibration and multiple system fusion for spoken term detection using linear logistic regression
Author :
van Hout, Julien ; Ferrer, Luciana ; Vergyri, Dimitra ; Scheffer, Nicolas ; Lei, Yunwen ; Mitra, Ved ; Wegmann, Steven
Author_Institution :
SRI Int., Menlo Park, CA, USA
Abstract :
State-of-the-art calibration and fusion approaches for spoken term detection (STD) systems currently rely on a multi-pass approach in which the scores are calibrated, then fused, and finally re-calibrated to obtain a single decision threshold across keywords. While these techniques are theoretically correct, they rely on meta-parameter tuning and are prone to over-fitting. This study presents an efficient and effective score calibration technique for keyword detection that is based on the logistic regression calibration approach commonly used in forensic speaker identification. The technique applies seamlessly to both single systems and system fusion, and enables optimization for specific keyword detection evaluation functions. We run experiments on a Vietnamese STD task, comparing the technique with more empirical calibration and fusion schemes, and demonstrate that it achieves comparable or better performance in terms of the NIST ATWV metric with a more elegant solution.
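As a rough illustration of the idea the abstract describes, the sketch below fuses the scores of two detection systems with a linear logistic regression: the fused score is an offset plus a weighted sum of the per-system scores, and the parameters are fit by minimizing cross-entropy. This is a minimal sketch under stated assumptions, not the authors' implementation: the synthetic scores, the function names (`fuse`, `train`, `cross_entropy`), the learning rate, and the plain unweighted cross-entropy objective are all hypothetical choices made here, and the abstract does not specify how the objective is adapted to ATWV.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fuse(weights, bias, scores):
    # Linear fusion: one weight per system plus a shared offset.
    # The result is interpreted as a calibrated log-odds score.
    return bias + sum(w * s for w, s in zip(weights, scores))


def cross_entropy(weights, bias, X, y):
    # Mean binary cross-entropy of the fused posteriors.
    eps = 1e-12
    total = 0.0
    for scores, label in zip(X, y):
        p = sigmoid(fuse(weights, bias, scores))
        total -= label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps)
    return total / len(X)


def train(X, y, lr=0.1, epochs=500):
    # Batch gradient descent on the cross-entropy objective
    # (an assumed optimizer; any convex solver would do).
    n_sys = len(X[0])
    weights, bias = [0.0] * n_sys, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_sys, 0.0
        for scores, label in zip(X, y):
            err = sigmoid(fuse(weights, bias, scores)) - label
            grad_b += err
            for i, s in enumerate(scores):
                grad_w[i] += err * s
        bias -= lr * grad_b / len(X)
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
    return weights, bias


# Synthetic, purely illustrative detection scores from two systems.
random.seed(0)
X, y = [], []
for _ in range(200):
    label = 1 if random.random() < 0.3 else 0
    mean = 1.0 if label else -1.0
    X.append([random.gauss(mean, 1.0), random.gauss(2 * mean, 2.0)])
    y.append(label)

initial_loss = cross_entropy([0.0, 0.0], 0.0, X, y)
weights, bias = train(X, y)
final_loss = cross_entropy(weights, bias, X, y)

# Because the fused scores behave like log-odds, one threshold
# (here 0, i.e. posterior 0.5) can be shared by all detections.
decisions = [fuse(weights, bias, s) > 0.0 for s in X]
```

With a single system, the same code reduces to calibration alone (one weight and one offset), which matches the abstract's claim that the approach covers both cases without a separate re-calibration pass.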
Keywords :
calibration; optimisation; regression analysis; speaker recognition; Vietnamese STD task; forensic speaker identification; keyword detection; linear logistic regression; logistic regression calibration; meta-parameter tuning; multipass approach; multiple system fusion; over-fitting; single decision threshold; spoken term detection; state-of-the-art calibration; Hidden Markov models; Logistics; NIST; Speech; Training data; Vectors; score calibration; score normalization; system fusion
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854985