Title :
Robust speech recognition using MLP neural network in log-spectral domain
Author :
Ghaemmaghami, Masoumeh P. ; Sameti, Hossein ; Razzazi, Farbod ; BabaAli, Bagher ; Dabbaghchian, Saeed
Author_Institution :
Dept. of Electr. Eng., Islamic Azad Univ., Tehran, Iran
Abstract :
This paper proposes an efficient and effective nonlinear noise suppression algorithm that operates in the feature domain and is motivated by the minimum mean square error (MMSE) optimization criterion. A multilayer perceptron (MLP) neural network in the log-spectral domain is employed to minimize the difference between noisy and clean speech. Used as a pre-processing stage of a speech recognition system, the method improves the recognition rate in noisy environments. We extend the system to different environments with different noises without retraining the HMM models. The feature extraction stage is trained with a small portion of noisy data, created by artificially adding different types of noise from the NOISEX-92 database to the TIMIT speech database. In real environments, where speech recognition systems must work, different types of noise occur at various SNRs. The proposed method offers four strategies, chosen according to the system's ability to identify the noise type and SNR. Experimental results show that the proposed method achieves significant improvement in recognition rates.
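The training setup described in the abstract (noise added to clean speech at a chosen SNR, then an MLP fitted in the log-spectral domain under a mean square error criterion) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: synthetic signals stand in for TIMIT and NOISEX-92, and the helper names (`mix_at_snr`, `log_spectrum`, `train_mlp`), frame length, hidden-layer size, and learning rate are hypothetical. The four strategies and the HMM recognizer are not shown.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the mixture has the requested SNR in dB."""
    noise = noise[: len(clean)]
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise


def log_spectrum(signal, frame_len=64, hop=32, eps=1e-8):
    """Frame-wise log power spectrum with a Hann window (assumed front end)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    return np.log(np.abs(np.fft.rfft(frames, axis=1)) ** 2 + eps)


def train_mlp(x, y, hidden=32, lr=0.05, epochs=300, seed=0):
    """One-hidden-layer tanh MLP, full-batch gradient descent on the MSE."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.1, (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.1, (hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    losses = []
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)           # hidden activations
        err = (h @ w2 + b2) - y            # prediction error vs. clean target
        losses.append(float(np.mean(err ** 2)))
        g_pred = 2.0 * err / err.size      # gradient of the mean squared error
        g_h = (g_pred @ w2.T) * (1.0 - h ** 2)
        w2 -= lr * (h.T @ g_pred)
        b2 -= lr * g_pred.sum(axis=0)
        w1 -= lr * (x.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    return (w1, b1, w2, b2), losses


# Synthetic stand-ins: an amplitude-modulated tone as "speech", white noise as "noise".
rng = np.random.default_rng(0)
t = np.arange(4000) / 8000.0
clean = np.sin(2 * np.pi * 440 * t) * (1.0 + 0.5 * np.sin(2 * np.pi * 3 * t))
noise = rng.normal(size=4000)

noisy = mix_at_snr(clean, noise, snr_db=10.0)
measured_snr = 10.0 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))

x = log_spectrum(noisy)   # network input: noisy log-spectral frames
y = log_spectrum(clean)   # network target: clean log-spectral frames
mu, sd = x.mean(axis=0), x.std(axis=0) + 1e-8
params, losses = train_mlp((x - mu) / sd, (y - mu) / sd)
```

In a recognizer, a network trained this way would be applied to each noisy frame before the features reach the acoustic models, which is how the abstract describes the pre-processing role; the sketch stops at training and makes no claim about the authors' actual architecture or data split.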
Keywords :
feature extraction; interference suppression; least mean squares methods; multilayer perceptrons; spectral analysis; speech recognition; MLP neural network; NOISEX-92 database; SNR; TIMIT speech database; log spectral domain; minimum mean square error; multi layer perceptron neural network; noisy environment; nonlinear feature domain noise suppression algorithm; preprocessing stage; recognition rates; robust speech recognition; Cepstral analysis; Hidden Markov models; Neural networks; Noise reduction; Noise robustness; Signal to noise ratio; Speech enhancement; Working environment noise; log spectral
Conference_Title :
Signal Processing and Information Technology (ISSPIT), 2009 IEEE International Symposium on
Conference_Location :
Ajman
Print_ISBN :
978-1-4244-5949-0
DOI :
10.1109/ISSPIT.2009.5407513