DocumentCode :
980799
Title :
Spectrum restoration from multiscale auditory phase singularities by generalized projections
Author :
Chi, Taishih ; Shamma, Shihab A.
Author_Institution :
Dept. of Commun. Eng., Nat. Chiao-Tung Univ., Hsinchu
Volume :
14
Issue :
4
fYear :
2006
fDate :
7/1/2006
Firstpage :
1179
Lastpage :
1192
Abstract :
We examine the encoding of acoustic spectra by parameters derived from singularities found in their multiscale auditory representations. The multiscale representation is a wavelet transform of an auditory version of the spectrum, formulated based on findings of perceptual experiments and physiological research in the auditory cortex. The multiscale representation of a spectral pattern usually contains well-defined singularities in its phase function that reflect prominent features of the underlying spectrum, such as its relative peak locations and amplitudes. Properties (locations and strengths) of these singularities are examined and employed to reconstruct the original spectrum using an iterative projection algorithm. Although the singularities form a nonconvex set, simulations demonstrate that the algorithm, started from a well-chosen initial pattern, usually converges to a good approximation of the input spectrum. Perceptually intelligible speech can be resynthesized from the reconstructed auditory spectrograms, and hence these singularities can potentially serve as efficient features in speech compression. In addition, the singularities are very noise-robust, which makes them useful features in applications such as vowel recognition and speaker identification.
Keywords :
audio coding; data compression; iterative methods; spectral analysis; speech coding; wavelet transforms; acoustic spectra encoding; auditory cortex; auditory spectrograms; generalized projections; intelligible speech; iterative projection algorithm; multiscale auditory phase singularities; multiscale representation; speaker identification; spectrum restoration; speech compression; vowel recognition; wavelet transform; Encoding; Filter bank; Image edge detection; Image reconstruction; Image restoration; Iterative algorithms; Projection algorithms; Spectrogram; Speech processing; Wavelet transforms; Auditory model; convex projection; phase singularity; spectrum restoration;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TSA.2005.860828
Filename :
1643647