Regional Information Center for Science and Technology - On the use of i–vector posterior distributions in Probabilistic Linear Discriminant Analysis

DocumentCode :

5398

Title :

On the use of i–vector posterior distributions in Probabilistic Linear Discriminant Analysis

Author :

Cumani, Sandro ; Plchot, Oldrich ; Laface, Pietro

Author_Institution :

Dipt. di Autom. e Inf., Politec. di Torino, Turin, Italy

Volume :

Issue :

fYear :

2014

fDate :

April 2014

Firstpage :

846

Lastpage :

857

Abstract :

The i-vector extraction process is affected by several factors, such as the noise level, the acoustic content of the observed features, the channel mismatch between the training conditions and the test data, and the duration of the analyzed speech segment. These factors influence both the i-vector estimate and its uncertainty, represented by the i-vector posterior covariance. This paper presents a new PLDA model that, unlike the standard one, exploits the intrinsic i-vector uncertainty. Since recognition accuracy is known to decrease for short speech segments, and segment length is one of the main factors affecting the i-vector covariance, we designed a set of experiments to compare the standard and the new PLDA models on short speech cuts of variable duration, randomly extracted from the conversations included in the NIST SRE 2010 extended dataset, both from interviews and telephone conversations. Our results on NIST SRE 2010 evaluation data show that, in different conditions, the new model outperforms the standard PLDA by more than 10% relative when tested on short segments with duration mismatches, and retains the accuracy of the standard model for sufficiently long speaker segments. This technique has also been successfully tested in the NIST SRE 2012 evaluation.
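
The abstract's point that segment duration drives the i-vector posterior covariance can be illustrated with a toy sketch. It assumes the standard total-variability formulation, in which the posterior precision of the i-vector is the prior precision plus a term that grows with the per-component frame counts. It is not the PLDA model proposed in the paper: the scalar i-vector, the helper `stats_for_duration`, and every numeric value below are hypothetical and chosen only for illustration.

```python
# Toy illustration (not the paper's PLDA model): posterior variance of a
# one-dimensional i-vector w under the total-variability model
#   s = m + T w,   w ~ N(0, 1)
# Given zero-order statistics N_c for each UBM component c, the posterior
# precision of w is
#   L = 1 + sum_c N_c * t_c^T Sigma_c^{-1} t_c
# and the posterior variance is 1 / L.

def posterior_variance(zero_order_stats, t_rows, ubm_variances):
    """Posterior variance of a scalar i-vector.

    zero_order_stats: N_c, soft frame count per UBM component
    t_rows: per-component loading vectors t_c (lists of floats)
    ubm_variances: per-component diagonal variances (lists of floats)
    """
    precision = 1.0  # contribution of the standard normal prior
    for n_c, t_c, var_c in zip(zero_order_stats, t_rows, ubm_variances):
        precision += n_c * sum(t * t / v for t, v in zip(t_c, var_c))
    return 1.0 / precision


def stats_for_duration(num_frames, weights):
    """Hypothetical helper: spread frames over components by fixed weights."""
    return [num_frames * w for w in weights]


# Made-up two-component, two-feature model (illustrative values only).
WEIGHTS = [0.6, 0.4]
T_ROWS = [[0.1, 0.2], [0.3, 0.1]]
UBM_VARIANCES = [[1.0, 2.0], [0.5, 1.0]]

# Roughly 3 s versus 150 s of speech at 100 frames per second (assumed rate).
short_var = posterior_variance(stats_for_duration(300, WEIGHTS), T_ROWS, UBM_VARIANCES)
long_var = posterior_variance(stats_for_duration(15000, WEIGHTS), T_ROWS, UBM_VARIANCES)

print(f"short segment variance: {short_var:.6f}")
print(f"long segment variance:  {long_var:.6f}")
```

In this sketch the short segment leaves a much larger posterior variance than the long one, which is the kind of uncertainty a standard PLDA model discards when it uses only the point estimate of the i-vector.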

Keywords :

speaker recognition; NIST SRE 2010 evaluation; PLDA model; acoustic content; channel mismatch; i-vector estimates; i-vector extraction process; i-vector posterior covariance; i-vector posterior distributions; intrinsic i-vector uncertainty; probabilistic linear discriminant analysis; recognition accuracy; short speech segments; speech segment; Computational modeling; NIST; Speech; Speech processing; Speech recognition; Vectors; I-vector extraction; I-vectors

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE/ACM Transactions on

Publisher :

IEEE

ISSN :

2329-9290

Type :

jour

DOI :

10.1109/TASLP.2014.2308473

Filename :

6748853

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=5398