An i-Vector PLDA based gender identification approach for severely distorted and multilingual DARPA RATS data

Author

Shivesh Ranjan;Gang Liu;John H. L. Hansen

Author_Institution

Center for Robust Speech Systems (CRSS), The University of Texas at Dallas, Richardson, TX, USA

fYear

2015

Firstpage

331

Lastpage

337

Abstract

This study proposes an i-Vector based approach to gender identification. Gender-labeled utterances from the Fisher English (FE) corpus are used to build the i-Vector extraction framework, and a Probabilistic Linear Discriminant Analysis (PLDA) back-end computes the gender identification scores. A novel duration mismatch compensation strategy is also presented that incurs very little degradation in identification accuracy even when the duration of the test segment is greatly reduced. The proposed method consistently outperforms a GMM-UBM based gender identification scheme on several test sets created from a held-out portion of the FE corpus, achieving an identification accuracy of up to 97.63%. On the severely distorted and multilingual DARPA RATS (Robust Automatic Transcription of Speech) corpora, the proposed approach achieves an identification accuracy of 76.48% using only the FE data in training. A novel unsupervised domain adaptation strategy is then presented that uses only unlabeled RATS data to adapt the out-of-domain PLDA parameters derived from the FE training data. This strategy yields a 6.8% relative improvement in identification accuracy and a 14.75% relative reduction in Equal Error Rate (EER) compared to using the out-of-domain PLDA model on the RATS test utterances. These improvements are significant because 1) the RATS test utterances are severely distorted, and 2) no labeled data of any kind is used for 4 of the 5 languages present in the test utterances.
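
The abstract reports results as relative changes in accuracy and EER. The sketch below is not the authors' code; it illustrates how such figures are commonly computed. The helper names `relative_change` and `equal_error_rate` are hypothetical, the scores are made-up toy values, and the threshold sweep is one simple way to approximate the EER. The last lines assume that the 6.8% relative improvement is measured against the 76.48% out-of-domain accuracy, which the abstract suggests but does not state explicitly.

```python
def relative_change(baseline, new):
    """Relative change of `new` with respect to `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    A trial is accepted as the target class when its score >= threshold.
    Returns the mean of the false-accept and false-reject rates at the
    threshold where the two rates are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for thr in thresholds:
        # False rejects: target trials scored below the threshold.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        # False accepts: non-target trials scored at or above the threshold.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


# Toy scores (illustrative only, not data from the paper).
toy_target = [2.0, 1.5, 0.5, 3.0]
toy_nontarget = [-1.0, 0.0, 1.0, -2.0]
toy_eer = equal_error_rate(toy_target, toy_nontarget)

# Assumption: the 6.8% relative gain is taken over the 76.48% baseline.
baseline_acc = 76.48
implied_adapted_acc = baseline_acc * (1 + 6.8 / 100)
```

Under that assumption, `implied_adapted_acc` comes out at roughly 81.7%; the abstract itself does not report the adapted accuracy directly.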

Keywords

"Adaptation models","Speech","Iron","Training","Rats","Data models","Computational modeling"

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on

Type

conf

DOI

10.1109/ASRU.2015.7404813

Filename

7404813