Spectro-temporal smoothed auditory spectra for robust speaker identification

Author

Lin, Ting-Han ; Hsu, Chung-Chien ; Chi, Tai-Shih

Author_Institution

Dept. of Electr. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan

fYear

2010

fDate

Nov. 29 2010-Dec. 3 2010

Firstpage

313

Lastpage

317

Abstract

The performance of conventional speaker identification systems is severely compromised by interference such as additive or convolutional noise. High-level information about the speaker provides more robust cues for identification. This paper proposes an auditory-model-based spectro-temporal modulation filtering (STMF) process to capture high-level information for robust speaker identification. Text-independent closed-set speaker identification simulations are conducted on the TIMIT and GRID corpora to evaluate the robustness of Auditory Cepstral Coefficients (ACCs) after the STMF process. Simulation results show that ACCs substantially outperform conventional MFCCs in all SNR conditions. In low-SNR conditions, STMF is also shown to suppress noise better than the newly developed Auditory-based Nonnegative Tensor Cepstral Coefficients (ANTCCs).
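
The general idea of smoothing a spectrum along both frequency and time before taking cepstral coefficients can be illustrated with a minimal sketch. It is not the paper's auditory model or its STMF filters: the Gaussian low-pass kernels, the function names, the parameter values, and the synthetic input below are all assumptions chosen for illustration only.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian low-pass kernel (hypothetical choice)."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def smooth_along(spec, kernel, axis):
    """Convolve every slice along `axis` with `kernel`, padding with edge values."""
    radius = len(kernel) // 2
    pad = [(0, 0), (0, 0)]
    pad[axis] = (radius, radius)
    padded = np.pad(spec, pad, mode="edge")
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
    )


def spectro_temporal_smooth(spec, sigma_freq=1.5, sigma_time=3.0):
    """Low-pass a (channels x frames) spectrum in frequency, then in time."""
    out = smooth_along(spec, gaussian_kernel(sigma_freq), axis=0)
    return smooth_along(out, gaussian_kernel(sigma_time), axis=1)


def cepstral_coefficients(spec, n_coeffs=13):
    """Apply an unnormalized DCT-II across the frequency axis of each frame."""
    n_channels = spec.shape[0]
    n = np.arange(n_channels)
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_channels))
    return basis @ spec


# Synthetic stand-in for an auditory spectrum: 32 channels, 100 frames.
rng = np.random.default_rng(0)
clean = np.outer(np.hanning(32), np.ones(100))
noisy = clean + 0.3 * rng.standard_normal(clean.shape)

smoothed = spectro_temporal_smooth(noisy)
features = cepstral_coefficients(smoothed)  # shape: (13, 100)
```

In a closed-set identification setup of the kind the abstract describes, per-frame feature vectors like the columns of `features` would be scored against one model per enrolled speaker; that stage is omitted here.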

Keywords

filtering theory; speaker recognition; ACC; ANTCC; GRID corpora; STMF; TIMIT corpora; additive noises; auditory based nonnegative tensor cepstral coefficients; auditory cepstral coefficients; convolutional noises; robust speaker identification; spectro temporal modulation filtering; spectro temporal smoothed auditory spectra; text independent closed-set speaker identification simulations; auditory feature; Gaussian mixture model; speaker identification; spectro-temporal modulation

fLanguage

English

Publisher

IEEE

Conference_Title

Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on

Conference_Location

Tainan

Print_ISBN

978-1-4244-6244-5

Type

conf

DOI

10.1109/ISCSLP.2010.5684884

Filename

5684884