Title :
Environment-aware ideal binary mask estimation using monaural cues
Author :
May, Tobias ; Dau, Torsten
Author_Institution :
Centre for Appl. Hearing Res., Tech. Univ. of Denmark, Lyngby, Denmark
Abstract :
We present a monaural approach to speech segregation that estimates the ideal binary mask (IBM) by combining amplitude modulation spectrogram (AMS) features, pitch-based features and speech presence probability (SPP) features derived from noise statistics. To maintain a high mask estimation accuracy in the presence of various background noises, the system employs environment-specific segregation models and automatically selects the appropriate model for a given input signal. Furthermore, instead of classifying each time-frequency (T-F) unit independently, the a posteriori probabilities of speech and noise presence are evaluated by considering adjacent T-F units. The proposed system achieves high classification accuracy.
Keywords :
probability; signal classification; speech processing; time-frequency analysis; AMS; IBM; SPP; amplitude modulation spectrogram features; environment-aware ideal binary mask estimation; monaural approach; pitch-based features; speech presence probability features; speech segregation; time-frequency unit; Accuracy; Acoustics; Estimation; Noise measurement; Signal to noise ratio; Speech; background noise classification; ideal binary mask estimation
Conference_Title :
Applications of Signal Processing to Audio and Acoustics (WASPAA), 2013 IEEE Workshop on
Conference_Location :
New Paltz, NY
DOI :
10.1109/WASPAA.2013.6701821