  • DocumentCode
    3744863
  • Title
    An i-Vector PLDA based gender identification approach for severely distorted and multilingual DARPA RATS data
  • Author
    Shivesh Ranjan;Gang Liu;John H. L. Hansen
  • Author_Institution
    Center for Robust Speech Systems (CRSS), The University of Texas at Dallas, Richardson, TX, USA
  • fYear
    2015
  • Firstpage
    331
  • Lastpage
    337
  • Abstract
    This study proposes an i-Vector based approach to gender identification. Gender-labeled utterances from the Fisher English (FE) corpus are used to formulate an i-Vector extraction framework, and a Probabilistic Linear Discriminant Analysis (PLDA) back-end is employed to compute the scores for gender identification. A novel duration mismatch compensation strategy is also presented that incurs very little degradation in identification accuracy even with a large reduction in the duration of the test-segment. The proposed method is shown to consistently outperform a GMM-UBM based gender-identification scheme on several test-sets created from a held-out portion of the FE corpus, and achieves an identification accuracy of up to 97.63%. On the severely distorted and multilingual DARPA-RATS (Robust Automatic Transcription of Speech) corpora, the proposed approach achieves an identification accuracy of 76.48% using only the FE data in training. A novel unsupervised domain adaptation strategy is then presented that uses only unlabeled RATS data to adapt the out-of-domain PLDA parameters derived from the FE training data. The strategy offers a 6.8% relative improvement in identification accuracy and a 14.75% relative reduction in Equal Error Rate (EER) compared to using the out-of-domain PLDA model on the RATS test-utterances. These improvements are significant since: 1) RATS test-utterances are severely distorted, and 2) no labeled data of any kind is used for 4 of the 5 languages present in the test-utterances.
  • Keywords
    "Adaptation models","Speech","Iron","Training","Rats","Data models","Computational modeling"
  • Publisher
    ieee
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on
  • Type
    conf
  • DOI
    10.1109/ASRU.2015.7404813
  • Filename
    7404813