Title of article :
A perception- and PDE-based nonlinear transformation for processing spoken words
Author/Authors :
Qi, Yingyong and Xin, Jack
Issue Information :
Serial issue, year 2001
Pages :
18
From page :
143
To page :
160
Abstract :
Speech signals are often produced or received in the presence of noise, which is known to degrade the performance of speech recognition systems. In this paper, a perception- and PDE-based nonlinear transformation was developed to process spoken words in noisy environments. Our goal is to distinguish essential speech features and suppress noise so that the processed words are better recognized by computer software. The nonlinear transformation was applied to the spectrogram (short-term Fourier spectra) of a speech signal, which reveals the distribution of signal energy in time and frequency. The transformation reduces noise through time adaptation (reducing temporally slowly varying portions of the spectra) and enhances spectral peaks (formants) by evolving a focusing quadratic fourth-order PDE. Short-term spectra of speech signals were first divided into three (low, mid and high) frequency bands based on the critical bandwidth of human audition. An algorithm was developed to trace the upper and lower intensity envelopes of the signal in each band. The difference between the upper and lower envelopes reflects the signal-to-noise ratio (SNR) of each band. Constant, low-SNR signals in each band were adaptively decreased to reduce noise. Evolution of the focusing PDE was then used to enhance the spectral peaks and further reduce noise interference. Numerical results on noisy spoken words indicated that the transformed spectral pattern of the spoken words was insensitive to noise for SNR ranging from 0 to 20 dB (decibels). The spectral distances between noisy words and original words decreased after the transformation. A numerical experiment was performed on 11 spoken words at SNR = 5 dB. A noisy word was recognized numerically by finding the clean template at the smallest L2 spectral distance. The experiment reached a recognition rate as high as 100%. Analyses of the properties of the transformation are provided.
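The recognition step described in the abstract (matching a noisy word to the clean template at the smallest L2 spectral distance) can be illustrated with a minimal Python sketch. This is not the authors' transformation: the time adaptation and PDE enhancement stages are omitted, and the frame length, hop size, Hann window, synthetic tones standing in for spoken words, and all function names are assumptions made only for illustration.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude short-term Fourier spectra, shape (frames, frame_len // 2 + 1).

    frame_len, hop and the Hann window are illustrative choices, not values
    taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def l2_distance(spec_a, spec_b):
    """L2 distance between two spectrograms of identical shape."""
    return float(np.sqrt(np.sum((spec_a - spec_b) ** 2)))

def recognize(noisy_spec, templates):
    """Return the label of the clean template closest in L2 distance."""
    return min(templates, key=lambda label: l2_distance(noisy_spec, templates[label]))

def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise scaled to the requested SNR in dB."""
    noise = rng.standard_normal(len(signal))
    scale = np.sqrt(np.mean(signal ** 2) / (np.mean(noise ** 2) * 10 ** (snr_db / 10)))
    return signal + scale * noise

# Hypothetical stand-ins for spoken words: two pure tones sampled at 8 kHz.
rng = np.random.default_rng(0)
t = np.arange(4000) / 8000.0
clean = {"low": np.sin(2 * np.pi * 440 * t), "high": np.sin(2 * np.pi * 1200 * t)}
templates = {label: spectrogram(sig) for label, sig in clean.items()}

noisy = add_noise(clean["high"], snr_db=5.0, rng=rng)
print(recognize(spectrogram(noisy), templates))
```

In the paper's setting, the spectra on both sides of the comparison would presumably be the transformed ones; the sketch compares raw magnitude spectrograms only to show the nearest-template rule itself.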
Keywords :
Spoken words processing , Nonlinear transformation , Noise , Perception and PDE
Journal title :
Physica D Nonlinear Phenomena
Serial Year :
2001
Record number :
1727140