S-vector: A discriminative representation derived from i-vector for speaker verification

Author

Yusuf Ziya Işik;Hakan Erdogan;Ruhi Sarikaya

Author_Institution

TUBITAK BILGEM, Gebze, Turkey

fYear

2015

Firstpage

2097

Lastpage

2101

Abstract

Representing data in ways that disentangle and factor out hidden dependencies is a critical step in speaker recognition systems. In this work, we employ deep neural networks (DNNs) as a feature extractor to separate the speaker factors from other sources of variability in the commonly used i-vector features and to emphasize them. Denoising autoencoder based unsupervised pre-training, random dropout fine-tuning, and Nesterov accelerated gradient based momentum are used in DNN training. Replacing the i-vectors with the resulting speaker vectors (s-vectors), we obtain superior results on NIST SRE corpora over a wide range of operating points using a probabilistic linear discriminant analysis (PLDA) back-end.
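
The abstract names Nesterov accelerated gradient momentum as one of the training ingredients but does not give the update rule or any hyperparameters. As a rough illustration only, the sketch below shows the standard look-ahead form of the update on a toy quadratic objective. The function names, the learning rate `lr`, the momentum coefficient `mu`, and the toy objective are all assumptions made for this example and are not taken from the paper.

```python
def nesterov_step(w, v, grad_fn, lr=0.1, mu=0.9):
    """One Nesterov accelerated gradient update on a list of floats.

    lr and mu are illustrative values, not the paper's settings.
    """
    # Evaluate the gradient at the look-ahead point w + mu * v.
    lookahead = [wi + mu * vi for wi, vi in zip(w, v)]
    g = grad_fn(lookahead)
    # Update the velocity, then move the parameters by the new velocity.
    v_new = [mu * vi - lr * gi for vi, gi in zip(v, g)]
    w_new = [wi + vi for wi, vi in zip(w, v_new)]
    return w_new, v_new


def minimize(grad_fn, w0, steps=200, lr=0.1, mu=0.9):
    """Run a fixed number of Nesterov steps starting from w0."""
    w = list(w0)
    v = [0.0] * len(w)
    for _ in range(steps):
        w, v = nesterov_step(w, v, grad_fn, lr, mu)
    return w


# Hypothetical toy objective: f(w) = 0.5 * sum((w_i - t_i)^2).
TARGET = [1.0, -2.0, 0.5]


def toy_grad(w):
    """Gradient of the toy quadratic, i.e. w - TARGET."""
    return [wi - ti for wi, ti in zip(w, TARGET)]


w_final = minimize(toy_grad, [0.0, 0.0, 0.0])
```

In a real DNN, `grad_fn` would be the backpropagated gradient of the training loss with respect to the network weights; the toy quadratic is used here only so the behaviour of the update can be checked by hand.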

Keywords

"Training","Neural networks","Noise reduction","NIST","Feature extraction","Robustness","Noise measurement"

Publisher

IEEE

Conference_Titel

Signal Processing Conference (EUSIPCO), 2015 23rd European

Electronic_ISBN

2076-1465

Type

conf

DOI

10.1109/EUSIPCO.2015.7362754

Filename

7362754

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3716203