DocumentCode :
45770
Title :
An Investigation into Back-end Advancements for Speaker Recognition in Multi-Session and Noisy Enrollment Scenarios
Author :
Liu, Gang ; Hansen, John H. L.
Author_Institution :
Center for Robust Speech Systems (CRSS), University of Texas at Dallas, Richardson, TX, USA
Volume :
22
Issue :
12
fYear :
2014
fDate :
Dec. 2014
Firstpage :
1978
Lastpage :
1992
Abstract :
This study explores robust speaker recognition with multi-session enrollments and noise, with an emphasis on optimal organization and utilization of the speaker information present in the enrollment and development data. The study has two core objectives. First, we investigate more robust back-ends to address noisy multi-session enrollment data for speaker recognition. This is achieved by proposing novel back-end algorithms. Second, we construct a highly discriminative speaker verification framework. This is achieved through intrinsic and extrinsic back-end algorithm modification, resulting in complementary sub-systems. The proposed framework is evaluated on the NIST SRE2012 corpus. The results not only confirm individual sub-system advancements over an established baseline; the final grand fusion solution also represents a comprehensive overall advancement for the NIST SRE2012 core tasks. Compared with state-of-the-art SID systems on the NIST SRE2012, the novel parts of this study are: 1) exploring a more diverse set of solutions for low-dimensional i-Vector based modeling; and 2) diversifying the information configuration before modeling. These two parts work together, resulting in very competitive performance at reasonable computational cost.
Keywords :
acoustic noise; speaker recognition; NIST SRE2012 corpus; SID systems; back-end advancements; discriminative speaker verification framework; extrinsic back-end algorithm; grand fusion solution; information configuration diversifying; intrinsic algorithm modification; low-dimensional i-Vector based modeling; multisession enrollments scenarios; noisy enrollment scenario; robust speaker recognition; Computational modeling; Covariance matrices; Noise measurement; Speech; Speech processing; Support vector machines; Classification algorithms; GCDS; PLDA; universal background support
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
2329-9290
Type :
jour
DOI :
10.1109/TASLP.2014.2352154
Filename :
6883142