On the use of different speech representations for speaker modeling

Author

Chen, Ke

Author_Institution

Sch. of Informatics, Univ. of Manchester, UK

Volume

35

Issue

3

fYear

2005

Firstpage

301

Lastpage

314

Abstract

Numerous speech representations have been reported to be useful in speaker recognition. However, there is much less agreement on which speech representation provides a perfect representation of speaker-specific information conveyed in a speech signal. Unlike previous work, we propose an alternative approach to speaker modeling by the simultaneous use of different speech representations in an optimal way. Inspired by our previous empirical studies, we present a soft competition scheme on different speech representations to exploit different speech representations in encoding speaker-specific information. On the basis of this soft competition scheme, we present a parametric statistical model, generalized Gaussian mixture model (GGMM), to characterize a speaker identity based on different speech representations. Moreover, we develop an expectation-maximization algorithm for parameter estimation in the GGMM. The proposed speaker modeling approach has been applied to text-independent speaker recognition and comparative results on the KING speech corpus demonstrate its effectiveness.

Keywords

Gaussian processes; parameter estimation; speaker recognition; statistical analysis; KING speech corpus; expectation-maximization algorithm; generalized Gaussian mixture model; parameter estimation; parametric statistical model; soft competition scheme; speaker modeling; speaker-specific information encoding; speech representations; speech signal; text-independent speaker recognition; Cepstral analysis; Data mining; Encoding; Feature extraction; Pattern recognition; Signal processing; Speaker recognition; Speech analysis; Speech processing; Speech recognition; Different speech representations; KING speech corpus; expectation-maximization (EM) algorithm; generalized Gaussian mixture model (GGMM); soft competition; speaker modeling; speaker recognition; speaker-specific information;

fLanguage

English

Journal_Title

Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on

Publisher

IEEE

ISSN

1094-6977

Type

jour

DOI

10.1109/TSMCC.2005.848166

Filename

1487579