Physiological Feature Extraction for Text Independent Speaker Identification using Non-Uniform Subband Processing

Author

Xugang Lu ; Jianwu Dang

Author_Institution

Sch. of Inf. Sci., Japan Sci. & Technol. Adv. Inst., Ishikawa, Japan

Volume

4

fYear

2007

fDate

15-20 April 2007

Abstract

Features used for speech recognition should emphasize linguistic information while suppressing speaker differences. Features for speaker recognition should instead carry more speaker-specific information while attenuating linguistic information. In most studies, however, identical acoustic features are used for the two different tasks of speaker recognition and speech recognition. In this paper, we propose a new physiological feature extraction method that emphasizes individual information for speaker identification. For this purpose, the physiological features of speakers were analyzed from the point of view of speech production. We found that speaker-specific information is encoded in different frequency regions of the speech sound. The speaker-discriminative information in each frequency region was quantified using Fisher's F-ratio. Based on the F-ratio, we propose a non-uniform sub-band processing strategy to extract a new feature that emphasizes or refines the physiological aspects involved in speech production. We combined the new feature with a GMM for the speaker identification task and evaluated it on the NTT-VR speaker recognition database. Compared with the MFCC feature, the proposed feature reduced the identification error rate by 20.1%.
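
As an illustration of the per-region F-ratio mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the common definition of the F-ratio as the variance of the per-speaker means divided by the average within-speaker variance, computed independently for each frequency band. The function name `f_ratio_per_band`, the input layout (one band-energy vector per frame plus one speaker label per frame), and the toy data are all hypothetical.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def f_ratio_per_band(frames, speaker_ids):
    """Return one F-ratio per frequency band (assumed definition, see lead-in).

    frames: list of per-frame band-energy vectors, all of the same length.
    speaker_ids: one speaker label per frame.
    """
    # Group the frames by speaker label.
    by_speaker = defaultdict(list)
    for vec, spk in zip(frames, speaker_ids):
        by_speaker[spk].append(vec)

    n_bands = len(frames[0])
    ratios = []
    for b in range(n_bands):
        # Values of band b, kept separate for each speaker.
        per_speaker = [[vec[b] for vec in vecs] for vecs in by_speaker.values()]
        means = [fmean(values) for values in per_speaker]
        # Between-speaker spread: variance of the per-speaker means.
        between = pvariance(means)
        # Within-speaker spread: average of the per-speaker variances.
        within = fmean([pvariance(values) for values in per_speaker])
        # Guard against division by zero when a band never varies within speakers.
        ratios.append(between / within if within > 0 else float("inf"))
    return ratios


# Hypothetical toy data: two speakers, two frames each, two bands.
toy_frames = [[1, 0], [3, 2], [11, 0], [13, 2]]
toy_speakers = ["A", "A", "B", "B"]
toy_ratios = f_ratio_per_band(toy_frames, toy_speakers)
```

In the toy data, band 0 separates the two speakers and receives a high ratio, while band 1 has identical speaker means and receives a ratio of zero. A non-uniform sub-band scheme of the kind the abstract describes would presumably allocate finer frequency resolution to bands with higher ratios, but how the paper maps ratios to sub-bands is not stated in this record.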

Keywords

Gaussian processes; feature extraction; speaker recognition; Fisher F-ratio; GMM; NTT-VR speaker recognition database; identification error rate; nonuniform subband processing; physiological feature extraction; speech recognition; text independent speaker identification; Data mining; Feature extraction; Frequency; Loudspeakers; Refining; Spatial databases; Speaker recognition; Speech analysis; Speech processing; Speech recognition; Speaker identification; non-uniform subband; physiological feature;

fLanguage

English

Publisher

ieee

Conference_Titel

Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on

Conference_Location

Honolulu, HI

ISSN

1520-6149

Print_ISBN

1-4244-0727-3

Type

conf

DOI

10.1109/ICASSP.2007.366949

Filename

4218137