Title :
Nonlinear cepstral equalisation method for noisy speech recognition
Author :
Lee, L.-M. ; Chen, J.-K. ; Wang, H.-C.
Author_Institution :
Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan
Date :
12/1/1994
Abstract :
The authors address automatic speech recognition in the presence of additive white noise. The effect of noise is modelled as an additive term in the power spectrum of the original clean speech, and the cepstral coefficients of the noisy speech are derived from this model. The reference cepstral vectors trained from clean speech are adapted to their corresponding noisy versions so as to best fit the test speech cepstral vector. The LPC coefficients, the LPC-derived cepstral coefficients, and the distance between test and reference are all regarded as functions of the noise ratio (the spectral power ratio of noise to noisy speech). A gradient-based algorithm is proposed to find the optimal noise ratio and the minimum distance between the test cepstral vector and the noise-adapted reference. A recursive algorithm based on the Levinson-Durbin recursion is proposed to calculate the LPC coefficients and their derivatives with respect to the noise ratio simultaneously. The stability of the proposed adaptation algorithm is also addressed. Experiments on multispeaker (50 male and 50 female speakers) isolated Mandarin digit recognition demonstrate remarkable performance improvements over the noncompensated method in noisy environments. The results are also compared with those of the projection-based approach, and the experiments show that the proposed method is superior to the projection approach in severely noisy environments.
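Illustrative sketch (not the authors' implementation): the abstract's noise model can be made concrete under a few assumptions. White noise of variance sigma^2 adds a constant to the power spectrum, so only the lag-0 autocorrelation changes. If the noise ratio is taken to be rho = sigma^2 / R_noisy[0] (an assumed reading of "spectral power ratio of noise to noisy speech"), then R_noisy[0] = R_clean[0] / (1 - rho). The sketch below adapts a clean reference, stored here as an autocorrelation sequence (an assumption), to a given noise ratio and derives LPC cepstra through the standard Levinson-Durbin and LPC-to-cepstrum recursions. A plain grid search stands in for the gradient-based search described in the abstract, and all function names are hypothetical.

```python
import numpy as np

def levinson_durbin(r, order):
    """Standard Levinson-Durbin recursion.
    Returns (a, err) for A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order."""
    r = np.asarray(r, dtype=float)
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # prediction error update
    return a, err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of the all-pole model 1/A(z)."""
    p = len(a) - 1
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        a_n = a[n] if n <= p else 0.0
        c[n] = -a_n - acc / n
    return c[1:]

def noisy_cepstrum(r_clean, noise_ratio, order, n_ceps):
    """Adapt a clean autocorrelation to an assumed noise ratio in [0, 1),
    then derive LPC cepstra. Only lag 0 changes under white noise."""
    r_noisy = np.array(r_clean, dtype=float)
    r_noisy[0] = r_noisy[0] / (1.0 - noise_ratio)
    a, _ = levinson_durbin(r_noisy, order)
    return lpc_to_cepstrum(a, n_ceps)

def best_noise_ratio(r_ref, c_test, order, grid=np.linspace(0.0, 0.95, 96)):
    """Grid-search stand-in for the gradient-based search: returns the
    noise ratio minimising the squared cepstral distance, and that distance."""
    dists = [np.sum((c_test - noisy_cepstrum(r_ref, g, order, len(c_test))) ** 2)
             for g in grid]
    i = int(np.argmin(dists))
    return float(grid[i]), float(dists[i])

if __name__ == "__main__":
    # Toy reference: autocorrelation of a first-order AR process (made-up data).
    r_ref = 0.9 ** np.arange(5)
    c_test = noisy_cepstrum(r_ref, 0.3, order=4, n_ceps=8)
    print(best_noise_ratio(r_ref, c_test, order=4))
```

The grid search is used only because it is easy to verify; the abstract instead describes computing derivatives of the LPC coefficients with respect to the noise ratio inside the Levinson-Durbin recursion, which this sketch does not attempt.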
Keywords :
autoregressive processes; cepstral analysis; equalisers; linear predictive coding; natural languages; recursive functions; speech processing; speech recognition; white noise; LPC coefficients; LPC derived cepstral coefficients; Levinson-Durbin recursion; additive white noise; automatic speech recognition; gradient based algorithm; minimum distance; multispeaker isolated Mandarin digits recognition; noise ratio; noisy speech recognition; nonlinear cepstral equalisation method; optimal noise ratio; power spectrum; projection based approach; recursive algorithm; spectral power ratio;
Journal_Title :
IEE Proceedings - Vision, Image and Signal Processing
DOI :
10.1049/ip-vis:19941373