A fast algorithm for stochastic matching with application to robust speaker verification

Author

Li, Qi ; Parthasarathy, S. ; Rosenberg, Aaron E.

Author_Institution

Bell Labs., Lucent Technol., Murray Hill, NJ, USA

Volume

2

fYear

1997

fDate

21-24 Apr 1997

Firstpage

1543

Abstract

Acoustic mismatch between training and test environments is one of the major problems in telephone-based speaker recognition. Speaker recognition performance degrades when an HMM trained under one set of conditions is used to evaluate data collected from different telephone channels, microphones, etc. The mismatch can be approximated as a linear transform in the cepstral domain. In this paper, we present a fast, efficient algorithm to estimate the parameters of the linear transform for real-time applications. Using the algorithm, test data are transformed toward the training conditions by rotation, scale, and translation without destroying the detailed characteristics of speech; speaker-dependent HMMs can then be used to evaluate the details under the same conditions as training. Compared to cepstral mean subtraction (CMS) and other bias removal techniques, the proposed linear transform is more general, since CMS and the others consider only translation; compared to maximum-likelihood approaches for stochastic matching, the proposed algorithm is simpler and faster, since iterative techniques are not required. The proposed algorithm improves the performance of a speaker verification system in the experiments reported in this paper.
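
To make the contrast between translation-only compensation and a full linear transform concrete, the following is a minimal sketch, not the estimator from the paper. It assumes the training condition is summarized by a mean vector and a covariance matrix of cepstral frames, and it uses a standard closed-form (non-iterative) moment-matching construction based on eigendecomposition. The function names (`cepstral_mean_subtraction`, `estimate_affine_match`, `apply_affine`) and the `eps` floor are hypothetical and chosen only for illustration.

```python
import numpy as np


def cepstral_mean_subtraction(frames):
    """Translation-only compensation: remove the per-utterance cepstral mean.

    frames: array of shape (num_frames, num_cepstra).
    """
    return frames - frames.mean(axis=0)


def estimate_affine_match(test_frames, train_mean, train_cov, eps=1e-8):
    """Closed-form A, b such that y = A x + b has the training mean and covariance.

    Assumption: second-order statistics are enough to describe the mismatch.
    This is a generic moment-matching construction, not the paper's algorithm.
    """
    test_mean = test_frames.mean(axis=0)
    test_cov = np.cov(test_frames, rowvar=False)

    # Eigendecompositions of the symmetric covariance matrices.
    test_vals, test_vecs = np.linalg.eigh(test_cov)
    train_vals, train_vecs = np.linalg.eigh(train_cov)

    # Inverse square root of the test covariance (rotate, rescale, rotate back).
    whiten = test_vecs @ np.diag(1.0 / np.sqrt(np.maximum(test_vals, eps))) @ test_vecs.T
    # Square root of the training covariance.
    color = train_vecs @ np.diag(np.sqrt(np.maximum(train_vals, 0.0))) @ train_vecs.T

    A = color @ whiten                 # rotation and scale
    b = train_mean - A @ test_mean     # translation
    return A, b


def apply_affine(frames, A, b):
    """Apply y = A x + b to every frame (frames are stored as rows)."""
    return frames @ A.T + b
```

In this sketch, `cepstral_mean_subtraction` corrects only the offset of the test data, while `estimate_affine_match` also adjusts orientation and spread. Both are computed in a single pass over the utterance statistics, with no iterative re-estimation.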

Keywords

cepstral analysis; hidden Markov models; parameter estimation; speaker recognition; stochastic processes; transforms; HMM; acoustic mismatch; bias removal techniques; cepstral domain; cepstral mean subtraction; fast algorithm; linear transform; maximum-likelihood approaches; real-time applications; robust speaker verification; rotation; scale; stochastic matching; telephone-based speaker recognition; test environments; training; translation; Acoustic testing; Cepstral analysis; Collision mitigation; Degradation; Hidden Markov models; Iterative algorithms; Performance evaluation; Robustness; Speaker recognition; Stochastic processes;

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on

Conference_Location

Munich

ISSN

1520-6149

Print_ISBN

0-8186-7919-0

Type

conf

DOI

10.1109/ICASSP.1997.596245

Filename

596245