Title :
A Cohort-Based Speaker Model Synthesis for Mismatched Channels in Speaker Verification
Author :
Wu, Wei ; Zheng, Thomas Fang ; Xu, Ming-Xing ; Soong, Frank K.
Abstract :
Mismatch between enrollment and test data is one of the leading causes of performance degradation in speaker recognition applications. This mismatch is particularly pronounced over public telephone networks, where input speech is collected over different handsets and transmitted over different channels from one trial to the next. In this paper, a cohort-based speaker model synthesis (SMS) algorithm is proposed for synthesizing robust speaker models without requiring channel-specific enrollment data. The algorithm uses a priori knowledge of channels, extracted from speaker-specific cohort sets, to synthesize such speaker models. Cohort selection in the proposed SMS can be either speaker-specific or Gaussian-component-based. Results on the China Criminal Police College (CCPC) speaker recognition corpus, which contains utterances from both landline and mobile channels, show that the new algorithms yield significant improvements in speaker verification performance over Htnorm and universal background model (UBM)-based speaker model synthesis.
Keywords :
Gaussian processes; speaker recognition; speech synthesis; telephone networks; Gaussian component; cohort-based speaker model synthesis; mismatched channels; public telephone networks; speaker verification; Algorithm design and analysis; Data mining; Degradation; Network synthesis; Robustness; Telephone sets; Telephony; Testing; Channel mismatch; cohort; speaker model synthesis
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2007.899297