• DocumentCode
    1758510
  • Title
    SNR-Invariant PLDA Modeling in Nonparametric Subspace for Robust Speaker Verification
  • Author
    Na Li ; Man-Wai Mak
  • Author_Institution
    Dept. of Electron. & Inf. Eng., Hong Kong Polytech. Univ., Hong Kong, China
  • Volume
    23
  • Issue
    10
  • fYear
    2015
  • fDate
    Oct. 2015
  • Firstpage
    1648
  • Lastpage
    1659
  • Abstract
    While the i-vector/PLDA framework has achieved great success, its performance still degrades dramatically under noisy conditions. To compensate for the variability of i-vectors caused by different levels of background noise, this paper proposes an SNR-invariant PLDA framework for robust speaker verification. First, nonparametric feature analysis (NFA) is employed to suppress intra-speaker variation and emphasize the discriminative information inherent in the boundaries between speakers in the i-vector space. Then, in the NFA-projected subspace, SNR-invariant PLDA is applied to separate the SNR-specific information from the speaker-specific information using an identity factor and an SNR factor. Accordingly, a projected i-vector in the NFA subspace can be represented as a linear combination of three components: speaker, SNR, and channel. During verification, the variability due to SNR and channel is integrated out when computing the marginal likelihood ratio. Experiments based on NIST 2012 SRE show that the proposed framework achieves superior performance compared with conventional PLDA and an SNR-dependent mixture of PLDA.
  • Keywords
    probability; speaker recognition; NFA subspace; NIST 2012 SRE; SNR-invariant PLDA modeling; SNR-specific information; discriminative information; i-vector-PLDA framework; intra-speaker variation; linear discriminant analysis; marginal likelihood ratio; nonparametric feature analysis; nonparametric subspace; probabilistic linear discriminant analysis; robust speaker verification; speaker-specific information; Covariance matrices; IEEE transactions; Robustness; Signal to noise ratio; Speech; Speech processing; Training; SNR-invariant; i-vector; nonparametric feature analysis; probabilistic linear discriminant analysis (PLDA); speaker verification
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2015.2442757
  • Filename
    7120100
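  • Note
    The abstract describes a projected i-vector as a linear combination of speaker, SNR, and channel components, with SNR and channel variability integrated out at scoring time. A minimal sketch of what such a model could look like is given below. All symbols (x, m, V, U, h, w, ε, Σ) and the Gaussian priors are assumptions chosen for illustration; this record does not state the paper's notation.

```latex
% Assumed notation: x_{ij}^{k} is the j-th NFA-projected i-vector of speaker i
% recorded in SNR group k; m is a global mean; V and U are loading matrices.
\mathbf{x}_{ij}^{k} = \mathbf{m}
  + \mathbf{V}\mathbf{h}_{i}                 % speaker (identity) component
  + \mathbf{U}\mathbf{w}_{k}                 % SNR component
  + \boldsymbol{\epsilon}_{ij}^{k}           % channel / residual component

% Assumed priors (standard for Gaussian PLDA variants):
\mathbf{h}_{i} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}), \quad
\mathbf{w}_{k} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}), \quad
\boldsymbol{\epsilon}_{ij}^{k} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})

% Generic marginal log-likelihood ratio for a target/test pair (x_s, x_t),
% where each likelihood is marginalized over h, w, and \epsilon:
S(\mathbf{x}_{s}, \mathbf{x}_{t}) =
  \log \frac{p(\mathbf{x}_{s}, \mathbf{x}_{t} \mid \text{same speaker})}
            {p(\mathbf{x}_{s})\, p(\mathbf{x}_{t})}
```

    Under these assumed priors, every term in the ratio is Gaussian, so the integration over the SNR and channel terms would have a closed form. The paper itself should be consulted for the exact formulation.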