Title :
Mel Frequency Cepstral Coefficients (MFCC) based speaker identification in noisy environment using wiener filter
Author :
Chauhan, Paresh M. ; Desai, N.P.
Author_Institution :
Dept. of Inf. Technol., Dharmsinh Desai Univ., Nadiad, India
Abstract :
Speech processing is an emerging area of signal processing. Its research areas include speech recognition, speaker identification (SI), and speech synthesis. SI is an important research area within speech processing: it means identifying a speaker from his or her spoken speech, relying on the speaker's speaking style. SI is mainly used in forensic analysis, home control systems, and database access services. SI has two essential stages: feature extraction and feature matching. Feature extraction derives a compact set of information from the available audio waveform that can be used to represent a particular speaker. Many feature extraction techniques are used for SI, including LPC (Linear Predictive Coefficients), MFCC (Mel Frequency Cepstral Coefficients), and PLP (Perceptual Linear Predictive Coefficients). MFCC gives good (efficient) identification results. Factors affecting SI include noise, sampling rate, and number of frames; among them, noise is the most critical. We found that MFCC is not very effective in noisy environments, especially when the noise conditions are mismatched. The identification rate degrades steadily as the noise level increases. To improve SI performance in real-world noisy environments, we propose a technique that is a variant of MFCC. The proposed MFCC includes a Wiener filter, which is well suited to handling noise in speech. Based on our experiments, we suggest that the Wiener filter is more effective in the frequency domain than in the time domain. With the proposed technique, we obtained an average identification rate of 88.57% on the NOIZEUS database. In feature matching, the unknown speech is classified using a classifier; we used a neural network for feature matching.
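The abstract does not give the formulas behind the proposed variant, so the following is only a minimal sketch of how a frequency-domain Wiener-style gain might be placed in front of a standard MFCC computation for one frame. Everything in it is an assumption for illustration: the function names, the spectral floor, the mel-scale formula, the filter count, and the way the noise power is supplied are hypothetical and are not taken from the paper.

```python
import cmath
import math


def hz_to_mel(f_hz):
    """A commonly used mel-scale formula (an assumption; variants exist)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive O(N^2) DFT power spectrum of a real frame, bins 0..N/2."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))
        spec.append(abs(s) ** 2 / n)
    return spec


def wiener_gain(noisy_power, noise_power, floor=0.01):
    """Per-bin gain: estimated clean power / noisy power, floored."""
    gains = []
    for p, n in zip(noisy_power, noise_power):
        g = (p - n) / p if p > 0 else 0.0
        gains.append(max(g, floor))
    return gains


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters with centers spaced evenly on the mel scale."""
    mel_max = hz_to_mel(sample_rate / 2.0)
    mels = [mel_max * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mels]
    bank = []
    for j in range(1, num_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):       # rising edge
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge, peak at center
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank


def mfcc_with_wiener(frame, noise_power, sample_rate,
                     num_filters=8, num_coeffs=5):
    """MFCCs of one frame after a frequency-domain Wiener-style gain."""
    noisy_power = power_spectrum(frame)
    gains = wiener_gain(noisy_power, noise_power)
    # The gain scales the amplitude spectrum, so power scales by gain squared.
    clean_power = [g * g * p for g, p in zip(gains, noisy_power)]
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    log_energies = [math.log(sum(w * p for w, p in zip(f, clean_power)) + 1e-10)
                    for f in bank]
    # DCT-II of the log filterbank energies gives the cepstral coefficients.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for c in range(num_coeffs)]
```

In this sketch the noise power per bin would have to be estimated separately, for example from frames assumed to contain no speech. A call such as `mfcc_with_wiener(frame, noise_power, 8000)` returns one feature vector per frame, which could then be passed to a classifier such as the neural network mentioned in the abstract.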
Keywords :
Wiener filters; cepstral analysis; feature extraction; neural nets; speaker recognition; speech synthesis; LPC; MFCC; NOIZEUS database; PLP; Wiener filter; feature matching; mel frequency cepstral coefficients; neural network; noise condition mismatch; noisy environment; perceptual linear predictive coefficients; signal processing; speaker identification; speech processing; speech recognition; Mel frequency cepstral coefficient; Noise; Noise measurement; Silicon; Speech; Training; Noise mismatch; Real world noisy environment
Conference_Title :
Green Computing Communication and Electrical Engineering (ICGCCEE), 2014 International Conference on
Conference_Location :
Coimbatore
DOI :
10.1109/ICGCCEE.2014.6921394