DocumentCode :
1060261
Title :
A Cohort-Based Speaker Model Synthesis for Mismatched Channels in Speaker Verification
Author :
Wu, Wei ; Zheng, Thomas Fang ; Xu, Ming-Xing ; Soong, Frank K.
Volume :
15
Issue :
6
Year :
2007
Firstpage :
1893
Lastpage :
1903
Abstract :
Mismatch between enrollment and test data is one of the leading causes of performance degradation in speaker recognition applications. The mismatch is particularly pronounced over public telephone networks, where input speech is collected over different handsets and transmitted over different channels from one trial to the next. This paper proposes a cohort-based speaker model synthesis (SMS) algorithm designed to synthesize robust speaker models without requiring channel-specific enrollment data. The algorithm uses a priori knowledge of channels, extracted from speaker-specific cohort sets, to synthesize such speaker models. Cohort selection in the proposed SMS can be either speaker-specific or Gaussian-component-based. Results on the China Criminal Police College (CCPC) speaker recognition corpus, which contains utterances from both landline and mobile channels, show that the new algorithms yield significant improvements in speaker verification performance over Htnorm and universal background model (UBM)-based speaker model synthesis.
Keywords :
Gaussian processes; speaker recognition; speech synthesis; telephone networks; Gaussian component; cohort-based speaker model synthesis; mismatched channels; public telephone networks; speaker verification; Algorithm design and analysis; Data mining; Degradation; Network synthesis; Robustness; Telephone sets; Telephony; Testing; Channel mismatch; cohort; speaker model synthesis
Language :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2007.899297
Filename :
4276767