DocumentCode
1060261
Title
A Cohort-Based Speaker Model Synthesis for Mismatched Channels in Speaker Verification
Author
Wu, Wei ; Zheng, Thomas Fang ; Xu, Ming-Xing ; Soong, Frank K.
Volume
15
Issue
6
fYear
2007
Firstpage
1893
Lastpage
1903
Abstract
Mismatch between enrollment and test data is one of the leading performance-degrading factors in speaker recognition applications. The mismatch is particularly pronounced over public telephone networks, where input speech data is collected over different handsets and transmitted over different channels from one trial to the next. In this paper, a cohort-based speaker model synthesis (SMS) algorithm, designed for synthesizing robust speaker models without requiring channel-specific enrollment data, is proposed. This algorithm utilizes a priori knowledge of channels extracted from speaker-specific cohort sets to synthesize such speaker models. The cohort selection in the proposed SMS can be either speaker-specific or Gaussian-component-based. Results on the China Criminal Police College (CCPC) speaker recognition corpus, which contains utterances from both landline and mobile channels, show that the new algorithms yield a significant improvement in speaker verification performance over Htnorm and universal background model (UBM)-based speaker model synthesis.
Keywords
Gaussian processes; speaker recognition; speech synthesis; telephone networks; Gaussian component; cohort-based speaker model synthesis; mismatched channels; public telephone networks; speaker verification; Algorithm design and analysis; Data mining; Degradation; Network synthesis; Robustness; Telephone sets; Telephony; Testing; Channel mismatch; cohort; speaker model synthesis
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1558-7916
Type
jour
DOI
10.1109/TASL.2007.899297
Filename
4276767