Regional Information Center for Science and Technology - The UMD-JHU 2011 speaker recognition system

DocumentCode :

3162635

Title :

The UMD-JHU 2011 speaker recognition system

Author :

Garcia-Romero, D. ; Zhou, X. ; Zotkin, D. ; Srinivasan, B. ; Luo, Y. ; Ganapathy, S. ; Thomas, S. ; Nemala, S. ; Sivaram, GSVS ; Mirbagheri, M. ; Mallidi, SH ; Janu, T. ; Rajan, P. ; Mesgarani, N. ; Elhilali, M. ; Hermansky, H. ; Shamma, S. ; Duraiswami,

Author_Institution :

Univ. of Maryland, College Park, MD, USA

fYear :

2012

fDate :

25-30 March 2012

Firstpage :

4229

Lastpage :

4232

Abstract :

In recent years, significant advances in the field of speaker recognition have resulted in very robust recognition systems. The primary focus of many recent developments has shifted to the problem of recognizing speakers in adverse conditions, e.g., in the presence of noise or reverberation. In this paper, we present the UMD-JHU speaker recognition system applied to the NIST 2010 SRE task. The novel aspects of our system are: 1) improved performance on trials involving different vocal effort via the use of linear-scale features; 2) expected improved recognition performance in the presence of reverberation and noise via the use of frequency domain perceptual linear predictor and cortical features; 3) a new discriminative kernel partial least squares (KPLS) framework that complements the state-of-the-art back-end systems JFA and PLDA to improve overall recognition; and 4) acceleration of the JFA, PLDA and KPLS back-ends via distributed computing. The individual components of the system and the fused system are compared against a baseline JFA system and against results reported by SRI and MIT-LL on SRE2010.

Keywords :

frequency-domain analysis; least squares approximations; probability; speaker recognition; MIT-LL; NIST 2010 SRE task; PLDA; SRI; UMD-JHU speaker recognition system; baseline JFA system; discriminative KPLS framework; discriminative kernel partial least squares framework; distributed computing; frequency domain cortical features; frequency domain perceptual linear predictor; joint factor analysis; linear-scale features; noise-reverberation; probabilistic linear discriminant analysis; robust recognition systems; Kernel; Mel frequency cepstral coefficient; Noise; Reverberation; Robustness; Speaker recognition; Speech; Cortical; FDLP; JFA; KPLS; LFCC; NIST SRE 2010; PLDA; Speaker recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location :

Kyoto

ISSN :

1520-6149

Print_ISBN :

978-1-4673-0045-2

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2012.6288852

Filename :

6288852

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3162635