Title :
Sparse time-frequency representations in audio processing, as studied through a symmetrized lognormal model
Author :
Wolfe, Patrick J.
Author_Institution :
Dept. of Stat., Harvard Univ., Cambridge, MA, USA
Abstract :
Time-frequency representations are ubiquitous in speech and audio signal processing, their use being motivated by both auditory physiology and the mathematics of Fourier analysis. Nonparametric statistical models (or, equivalently, transform-based signal processing methods) formulated in this space provide a principled way to decompose sounds into their constituent parts, as well as an effective means of exploiting the local correlation present in the time-frequency structure of naturally generated acoustic signals. Here we describe how an appropriate generative statistical model, even under very simple assumptions, provides a means of exploring sparse time-frequency representations in audio. We introduce a symmetrized lognormal model for spectral coefficients, which shows good agreement across a broad range of speech samples taken from the TIMIT database, and demonstrate preliminary speech enhancement results based on a maximum a posteriori shrinkage estimator.
Keywords :
Fourier analysis; audio signal processing; nonparametric statistics; signal representation; spectral analysis; time-frequency analysis; TIMIT database; acoustic signals; auditory physiology; maximum a posteriori shrinkage estimator; nonparametric statistical models; sparse time-frequency representations; spectral coefficients; speech enhancement; speech signal processing; symmetrized lognormal model; Histograms; Signal to noise ratio; Speech; Speech processing; Transforms
Conference_Title :
Signal Processing Conference, 2007 15th European
Conference_Location :
Poznan
Print_ISBN :
978-839-2134-04-6