DocumentCode :
1118224
Title :
Combined Estimation of Spectral Envelopes and Sound Source Direction of Concurrent Voices by Multidimensional Statistical Filtering
Author :
Nix, Johannes ; Hohmann, Volker
Author_Institution :
Med. Phys. Group, Oldenburg Univ.
Volume :
15
Issue :
3
fYear :
2007
fDate :
3/1/2007
Firstpage :
995
Lastpage :
1008
Abstract :
A key question for speech enhancement and simulations of auditory scene analysis in high levels of nonstationary noise is how to combine principles of auditory grouping and to integrate several noise-perturbed acoustical cues in a robust way. We present an application of recent online, nonlinear, non-Gaussian multidimensional statistical filtering methods which integrates tracking of sound-source direction and spectro-temporal dynamics of two mixed voices. The framework used is in agreement with the notion of evaluating competing hypotheses. To limit the number of hypotheses which need to be evaluated, the approach developed here uses a detailed statistical description of the high-dimensional spectro-temporal dynamics of speech, which is measured from a large speech database. The results show that the algorithm tracks sound source directions very precisely, separates the voice envelopes with algorithmic convergence times down to 50 ms, and enhances the signal-to-noise ratio in adverse conditions, requiring high computational effort. The approach has a high potential for improvements of efficiency and could be applied for voice separation and reduction of nonstationary noises.
Keywords :
filtering theory; nonlinear filters; spectral analysis; speech enhancement; statistical analysis; auditory grouping; auditory scene analysis; concurrent voices; high-dimensional spectro-temporal speech dynamics; multidimensional statistical filtering; noise-perturbed acoustical cues; nonstationary noise; nonstationary noises; online nonlinear nonGaussian filtering method; signal-to-noise ratio; sound source direction; spectral envelope estimation; speech enhancement; voice separation; Acoustic noise; Analytical models; Databases; Filtering; Image analysis; Multidimensional systems; Noise level; Noise robustness; Speech analysis; Speech enhancement; Auditory scene analysis; binding; computational auditory scene analysis (CASA); nonstationary noise; speech enhancement;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2006.889788
Filename :
4100691