Text-independent speaker recognition based on the Hurst parameter and the multidimensional fractional Brownian motion model

Author

Sant'Ana, Ricardo; Coelho, Rosângela; Alcaim, Abraham

Author_Institution

Electr. Eng. Dept., Inst. Militar de Engenharia, Rio de Janeiro, Brazil

Volume

14

Issue

3

fYear

2006

fDate

5/1/2006

Firstpage

931

Lastpage

940

Abstract

In this paper, a text-independent automatic speaker recognition (ASkR) system, the SR_Hurst, is proposed; it employs a new speech feature and a new classifier. The statistical feature pH is a vector of Hurst (H) parameters obtained by applying a wavelet-based multidimensional estimator (M_dim_wavelets) to the windowed short-time segments of speech. The proposed classifier for the speaker identification and verification tasks is based on the multidimensional fractional Brownian motion (fBm) model, denoted by M_dim_fBm. For a given sequence of input speech features, the speaker model is obtained from the sequence of vectors of H parameters, means, and variances of these features. The performance of the SR_Hurst was compared with that of the Gaussian mixture model (GMM), autoregressive vector (AR), and Bhattacharyya distance (dB) classifiers. The speech database, recorded from fixed and cellular phone channels, was uttered by 75 different speakers. The results show the superior performance of the M_dim_fBm classifier and that the pH feature adds new information on the speaker identity. In addition, the proposed classifier employs a much simpler modeling structure than the GMM.
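
To illustrate the kind of computation behind a wavelet-based Hurst estimate, the following is a minimal one-dimensional sketch. It is not the authors' M_dim_wavelets estimator. It assumes a Haar wavelet, a stationary input that behaves like fractional Gaussian noise, and a weighted log-scale regression in which the slope of log2 detail energy versus scale is taken as 2H - 1. The function name `hurst_haar` and the parameter `min_coeffs` are hypothetical and introduced only for this example.

```python
import numpy as np

def hurst_haar(x, min_coeffs=16):
    """Estimate H from a 1-D sequence via a Haar wavelet log-scale regression.

    Assumption: x is stationary and fGn-like, so the mean detail energy at
    scale j grows as 2**(j * (2H - 1)).
    """
    a = np.asarray(x, dtype=float)
    scales, log_energy, counts = [], [], []
    j = 1
    # Keep decomposing while the next scale has enough detail coefficients.
    while a.size // 2 >= min_coeffs:
        a = a[: a.size - a.size % 2]             # drop a trailing odd sample
        d = (a[0::2] - a[1::2]) / np.sqrt(2.0)   # Haar detail coefficients
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)   # Haar approximation coefficients
        scales.append(j)
        log_energy.append(np.log2(np.mean(d ** 2)))
        counts.append(d.size)
        j += 1
    if len(scales) < 2:
        raise ValueError("sequence too short for a log-scale regression")
    # Weighted linear fit: scales with more coefficients are trusted more.
    slope, _ = np.polyfit(scales, log_energy, 1, w=np.sqrt(counts))
    return (slope + 1.0) / 2.0

# Usage: white Gaussian noise has no long-range dependence, so H should be near 0.5.
rng = np.random.default_rng(0)
h_white = hurst_haar(rng.standard_normal(2 ** 14))
```

A feature vector in the spirit of pH could then be formed by collecting one such estimate per feature dimension, although the exact construction used in the paper is not specified in this record.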

Keywords

Brownian motion; Gaussian processes; autoregressive processes; speaker recognition; wavelet transforms; Bhattacharyya distance classifiers; Gaussian mixture models; Hurst parameter; autoregressive vector; multidimensional fractional Brownian motion model; speaker identification; text-independent automatic speaker recognition; text-independent speaker recognition; wavelet-based multidimensional estimator; Acoustic distortion; Cepstrum; Fractals; Multidimensional systems; Robustness; Spatial databases; Speech processing; Stochastic processes; Automatic speaker recognition; multidimensional fractional Brownian motion; wavelet-based estimation

fLanguage

English

Journal_Title

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1558-7916

Type

jour

DOI

10.1109/TSA.2005.858054

Filename

1621205