Speaker recognition using neural responses from the model of the auditory system

Author

Razali, Noor Fadzilah ; Jassim, Wissam A. ; Roohisefat, Leyla ; Zilany, Muhammad S. A.

Author_Institution

Fac. of Electr. Eng., Univ. Technol. MARA (UiTM), Shah Alam, Malaysia

fYear

2014

fDate

1-4 Dec. 2014

Abstract

Speaker recognition is the process of determining a person's identity using features in speech signals. In this study, a new speaker recognition (identification and verification) system is proposed using the responses from a computational model of the auditory system. A 2D neurogram was constructed from the responses of the model of auditory nerve fibers for a range of characteristic frequencies. The proposed neurogram-based speaker recognition system was trained and tested using a Gaussian mixture model classification technique. The performance of the proposed method was evaluated for both clean speech and speech in noisy environments. The results of the proposed method were compared with those of a traditional speaker recognition technique, referred to as the mel-frequency cepstral coefficient method. The proposed method showed better performance than the traditional approach, especially under noisy conditions. The proposed method could be applied in security and voice recognition systems.
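
The classification stage described in the abstract (one Gaussian mixture model per speaker, with a test utterance assigned to the best-scoring model) can be illustrated with a minimal sketch. This is not the authors' implementation: the auditory nerve model and the neurogram are not reproduced, and synthetic random vectors stand in for the neurogram-derived feature frames. All function names, the number of mixture components, and the feature dimension are assumptions made for illustration, and only identification (not verification) is shown.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def fit_gmm(frames, n_components=2, n_iter=20, seed=0, var_floor=1e-3):
    """Fit a diagonal-covariance GMM with EM; returns (weights, means, variances)."""
    rng = random.Random(seed)
    dim = len(frames[0])
    means = [list(f) for f in rng.sample(frames, n_components)]
    variances = [[1.0] * dim for _ in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each frame.
        resp = []
        for x in frames:
            logs = [math.log(w) + log_gauss_diag(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and (floored) variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-10
            weights[k] = nk / len(frames)
            means[k] = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                        for d in range(dim)]
            variances[k] = [
                max(var_floor,
                    sum(r[k] * (x[d] - means[k][d]) ** 2
                        for r, x in zip(resp, frames)) / nk)
                for d in range(dim)
            ]
    return weights, means, variances


def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of the frames under a fitted GMM."""
    weights, means, variances = gmm
    total = 0.0
    for x in frames:
        total += logsumexp([math.log(w) + log_gauss_diag(x, m, v)
                            for w, m, v in zip(weights, means, variances)])
    return total / len(frames)


def identify(frames, models):
    """Return the name of the speaker model that scores the frames highest."""
    return max(models, key=lambda name: avg_log_likelihood(frames, models[name]))


def synthetic_frames(center, n, rng, dim=4):
    """Placeholder features: Gaussian noise around a speaker-specific center."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(42)
models = {
    "speaker_a": fit_gmm(synthetic_frames(0.0, 200, rng)),
    "speaker_b": fit_gmm(synthetic_frames(3.0, 200, rng)),
}
test_utterance = synthetic_frames(3.0, 50, rng)
predicted = identify(test_utterance, models)
print(predicted)
```

In a real system, the synthetic frames would be replaced by feature vectors extracted from each utterance (neurogram-based or mel-frequency cepstral coefficients), and a verification decision would typically compare the claimed speaker's score against a threshold or a background model.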

Keywords

Gaussian processes; cepstral analysis; mixture models; neural nets; signal classification; speaker recognition; speech processing; 2D neurogram; Gaussian mixture model classification technique; auditory nerve fibers; auditory system; auditory system model; characteristic frequencies; clean speech condition; computational model; mel-frequency cepstral coefficient method; neural responses; noisy speech condition; performance evaluation; person identity; security systems; speaker identification system; speaker recognition; speaker verification system; speech signals; voice recognition systems; Mel frequency cepstral coefficient; Noise measurement; Robustness; Signal to noise ratio; Speaker recognition; Speech; Gaussian Mixture Model; auditory nerve model; mel-frequency cepstral coefficient; neurogram; speaker identification; speaker verification;

fLanguage

English

Publisher

IEEE

Conference_Titel

Intelligent Signal Processing and Communication Systems (ISPACS), 2014 International Symposium on

Conference_Location

Kuching

Type

conf

DOI

10.1109/ISPACS.2014.7024428

Filename

7024428