Title :
Improved A Posteriori Speech Presence Probability Estimation Based on a Likelihood Ratio With Fixed Priors
Author :
Gerkmann, Timo ; Breithaupt, Colin ; Martin, Rainer
Author_Institution :
Inst. of Commun. Acoust. (IKA), Ruhr-Univ. Bochum, Bochum
fDate :
7/1/2008 12:00:00 AM
Abstract :
In this paper, we present an improved estimator for the speech presence probability at each time-frequency point in the short-time Fourier transform domain. In contrast to existing approaches, this estimator does not rely on an adaptively estimated and thus signal-dependent a priori signal-to-noise ratio estimate. It therefore decouples the estimation of the speech presence probability from the estimation of the clean speech spectral coefficients in a speech enhancement task. Using both a fixed a priori signal-to-noise ratio and a fixed prior probability of speech presence, the proposed a posteriori speech presence probability estimator achieves probabilities close to zero for speech absence and probabilities close to one for speech presence. While state-of-the-art speech presence probability estimators use adaptive prior probabilities and signal-to-noise ratio estimates, we argue that these quantities should reflect true a priori information that shall not depend on the observed signal. We present a detection theoretic framework for determining the fixed a priori signal-to-noise ratio. The proposed estimator is conceptually simple and yields a better tradeoff between speech distortion and noise leakage than state-of-the-art estimators.
Keywords :
Fourier transforms; maximum likelihood estimation; signal detection; speech enhancement; a posteriori speech presence probability estimation; detection theoretic framework; fixed prior probability; noise leakage; short-time Fourier transform domain; signal-to-noise ratio estimation; speech distortion; speech enhancement task; speech spectral coefficients; time-frequency point; Generalized likelihood ratio; softgain; speech analysis; speech presence probability (SPP)
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2008.921764
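
Illustrative sketch (not part of the original record): the abstract describes an a posteriori speech presence probability computed from a likelihood ratio with a fixed a priori signal-to-noise ratio and a fixed prior probability of speech presence. The Python snippet below shows what such a computation might look like for a single time-frequency point, assuming complex Gaussian models for the noise-only and speech-plus-noise hypotheses. The function name, the 15 dB fixed a priori SNR, and the 0.5 prior are assumptions chosen for illustration; they are not taken from the abstract, and the sketch does not reproduce the paper's detection-theoretic procedure for choosing the fixed SNR.

```python
import math


def speech_presence_probability(post_snr, xi_fixed_db=15.0, prior_speech=0.5):
    """A posteriori speech presence probability for one time-frequency point.

    post_snr     : a posteriori SNR, |Y|^2 / noise_power (linear scale, >= 0)
    xi_fixed_db  : fixed a priori SNR in dB (illustrative assumption)
    prior_speech : fixed prior probability of speech presence (assumption)
    """
    xi = 10.0 ** (xi_fixed_db / 10.0)                  # dB -> linear
    prior_ratio = (1.0 - prior_speech) / prior_speech  # P(H0) / P(H1)

    # Likelihood ratio p(y|H1) / p(y|H0) under complex Gaussian models:
    #   exp(post_snr * xi / (1 + xi)) / (1 + xi)
    # The exponent is clipped so math.exp cannot overflow for very large SNRs.
    exponent = min(post_snr * xi / (1.0 + xi), 700.0)
    likelihood_ratio = math.exp(exponent) / (1.0 + xi)

    # Bayes' rule: P(H1|y) = LR / (LR + P(H0)/P(H1))
    return likelihood_ratio / (likelihood_ratio + prior_ratio)


if __name__ == "__main__":
    for gamma in (0.0, 1.0, 5.0, 20.0):
        print(gamma, speech_presence_probability(gamma))
```

Because the a priori SNR and the prior are constants in this sketch, the result depends on the observation only through the a posteriori SNR: small values give probabilities near zero, large values give probabilities near one, in line with the behavior the abstract describes.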