A Cohort-Based Speaker Model Synthesis for Mismatched Channels in Speaker Verification

Author

Wu, Wei ; Zheng, Thomas Fang ; Xu, Ming-Xing ; Soong, Frank K.

Volume

15

Issue

6

Year

2007

Firstpage

1893

Lastpage

1903

Abstract

Mismatch between enrollment and test data is one of the leading performance-degrading factors in speaker recognition applications. This mismatch is particularly pronounced over public telephone networks, where input speech data is collected over different handsets and transmitted over different channels from one trial to the next. In this paper, a cohort-based speaker model synthesis (SMS) algorithm, designed for synthesizing robust speaker models without requiring channel-specific enrollment data, is proposed. This algorithm utilizes a priori knowledge of channels extracted from speaker-specific cohort sets to synthesize such speaker models. The cohort selection in the proposed SMS can be either speaker-specific or Gaussian component based. Results on the China Criminal Police College (CCPC) speaker recognition corpus, which contains utterances from both landline and mobile channels, show that the new algorithms yield significant improvements in speaker verification performance over Htnorm and universal background model (UBM)-based speaker model synthesis.

Keywords

Gaussian processes; speaker recognition; speech synthesis; telephone networks; Gaussian component; cohort-based speaker model synthesis; mismatched channels; public telephone networks; speaker verification; Algorithm design and analysis; Data mining; Degradation; Network synthesis; Robustness; Telephone sets; Telephony; Testing; Channel mismatch; cohort; speaker model synthesis

Language

English

Journal_Title

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1558-7916

Type

jour

DOI

10.1109/TASL.2007.899297

Filename

4276767