Regional Information Center for Science and Technology - Blind stochastic feature transformation for speaker verification over cellular networks

DocumentCode :

3249403

Title :

Blind stochastic feature transformation for speaker verification over cellular networks

Author :

Yiu, Kwok-Kwong ; Mak, Man Wai ; Cheung, Ming-Cheung ; Kung, Sun-Yuan

Author_Institution :

Dept. of Electron. & Inf. Eng., Hong Kong Polytech. Univ., Kowloon, China

fYear :

2004

fDate :

20-22 Oct. 2004

Firstpage :

679

Lastpage :

682

Abstract :

Acoustic mismatch between the training and recognition conditions is one of the serious challenges facing speaker recognition researchers today. The goal of channel compensation is to achieve performance approaching that of a "matched condition" system while avoiding the need for a large amount of training data. It is important to ensure that the channel compensation algorithms in these systems compensate for channel variation rather than speaker variation. This paper addresses the problem of unsupervised compensation, in which the features of a test utterance are transformed to fit the clean speaker model and the gender-dependent background model. Specifically, a feature-based transformation is estimated from the statistical difference between a test utterance and a composite acoustic model formed by combining the speaker and background models. By transforming the features to fit both models, the transformation is implicitly constrained. Experimental results based on the 2001 NIST evaluation set show that the proposed transformation approach achieves significant improvement in both equal error rate and minimum detection cost compared with cepstral mean subtraction, Z-norm and short-time Gaussianization.
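
The idea of fitting test features to a composite model can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a bias-only transformation y = x + b, diagonal-covariance GMMs given as lists of (weight, mean, variance) tuples, and equal pooling of the speaker and background models. The function names (composite_gmm, estimate_bias) and the speaker_weight parameter are hypothetical, and the paper's actual transformation may be of a different form.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def composite_gmm(speaker, background, speaker_weight=0.5):
    """Pool two GMMs into one mixture (assumed pooling rule, not from the paper)."""
    pooled = [(speaker_weight * w, m, v) for w, m, v in speaker]
    pooled += [((1.0 - speaker_weight) * w, m, v) for w, m, v in background]
    return pooled

def estimate_bias(frames, gmm, n_iter=10):
    """EM estimate of a bias b such that frames + b fit the GMM better."""
    dim = len(frames[0])
    b = [0.0] * dim
    for _ in range(n_iter):
        num = [0.0] * dim
        den = [0.0] * dim
        for x in frames:
            y = [xi + bi for xi, bi in zip(x, b)]
            # E-step: component posteriors for the transformed frame.
            logp = [math.log(w) + log_gauss_diag(y, m, v) for w, m, v in gmm]
            top = max(logp)
            post = [math.exp(lp - top) for lp in logp]
            total = sum(post)
            # M-step accumulators: precision-weighted offsets to each mean.
            for g, (_, m, v) in zip(post, gmm):
                g /= total
                for d in range(dim):
                    num[d] += g * (m[d] - x[d]) / v[d]
                    den[d] += g / v[d]
        b = [n / d for n, d in zip(num, den)]
    return b

# Toy usage: a one-component speaker model and a broader background model.
speaker = [(1.0, [0.0], [1.0])]
background = [(1.0, [0.0], [4.0])]
model = composite_gmm(speaker, background)
shifted_frames = [[2.0], [3.0], [4.0]]  # clean features plus an unknown offset
bias = estimate_bias(shifted_frames, model)
compensated = [[x[0] + bias[0]] for x in shifted_frames]
```

Because the bias is estimated against the pooled components instead of the speaker model alone, it cannot simply pull an impostor's features onto the claimed speaker, which is one way to read the abstract's remark that the transformation is implicitly constrained.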

Keywords :

cellular radio; compensation; speaker recognition; statistical analysis; stochastic processes; MAP adaptation; blind stochastic feature transformation; cellular networks; channel robust speaker verification; channel variation; clean speaker model; composite GMM; composite acoustic model; gender-dependent background model; speaker models; speaker recognition accuracy; test utterance; training/recognition conditions acoustic mismatch; unsupervised channel compensation; Acoustic signal detection; Acoustic testing; Error analysis; Face recognition; Land mobile radio cellular systems; Loudspeakers; NIST; Speaker recognition; Stochastic processes; Training data;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Intelligent Multimedia, Video and Speech Processing, 2004. Proceedings of 2004 International Symposium on

Print_ISBN :

0-7803-8687-6

Type :

conf

DOI :

10.1109/ISIMP.2004.1434155

Filename :

1434155

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3249403