Title :
A structure-preserving training target for supervised speech separation
Author :
Yuxuan Wang ; DeLiang Wang
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
Abstract :
Supervised-learning-based speech separation has shown considerable success recently. In its simplest form, a discriminative model is trained as a time-frequency masking function, where the training target is an ideal mask. Ideal masks, such as the ideal binary mask, are structured spectro-temporal patterns. However, previous formulations do not model this prominent output structure. In this paper, we propose an alternative training target that is explicitly related to mask structure. We first learn a compositional model of the square-root ideal ratio mask, which is closely related to the Wiener filter. Instead of directly estimating the ideal mask values, we learn to predict the weights of the resulting mask-level spectro-temporal bases, which are then used to generate the estimated masks. In other words, the discriminative model is used to predict the parameters of a generative model of the target of interest. Experimental results show consistent improvements in low-SNR conditions when the new training target is adopted.
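The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the compositional model is learned, so plain non-negative matrix factorization (NMF) with multiplicative updates is assumed here, per-frame spectral bases stand in for the spectro-temporal bases, and random arrays stand in for real speech and noise power spectrograms. The names `sqrt_irm`, `learn_bases`, and `mask_from_weights` are hypothetical. In a full system, a discriminative model would predict the weights `H` from noisy features; here they come directly from the factorization.

```python
import numpy as np


def sqrt_irm(speech_power, noise_power, eps=1e-12):
    """Square-root ideal ratio mask for each time-frequency unit."""
    return np.sqrt(speech_power / (speech_power + noise_power + eps))


def learn_bases(masks, n_bases, n_iter=200, seed=0, eps=1e-12):
    """Assumed compositional model: NMF so that masks ~= W @ H.

    Uses multiplicative updates for the Euclidean cost.
    W holds the bases (freq x n_bases), H the weights (n_bases x frames).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = masks.shape
    W = rng.random((n_freq, n_bases)) + eps
    H = rng.random((n_bases, n_frames)) + eps
    for _ in range(n_iter):
        H *= (W.T @ masks) / (W.T @ W @ H + eps)
        W *= (masks @ H.T) / (W @ H @ H.T + eps)
    return W, H


def mask_from_weights(W, H):
    """Generate a mask from bases and weights, clipped to [0, 1]."""
    return np.clip(W @ H, 0.0, 1.0)


# Toy data standing in for speech and noise power spectrograms.
rng = np.random.default_rng(1)
speech = rng.random((64, 100)) ** 2
noise = rng.random((64, 100)) ** 2

target = sqrt_irm(speech, noise)        # ideal mask to be modeled
W, H = learn_bases(target, n_bases=16)  # bases and per-frame weights
estimate = mask_from_weights(W, H)      # mask generated from the weights
```

Under these assumptions, the training target for the discriminative model would be the columns of `H` rather than the mask values in `target`, and the estimated mask is always a non-negative combination of the learned bases.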
Keywords :
Wiener filters; learning (artificial intelligence); source separation; speech intelligibility; speech processing; SNR conditions; Wiener filter; compositional model; discriminative model; generative model; ideal binary masks; ideal mask values; mask structure; mask-level spectrotemporal bases; square-root ideal ratio mask; structure-preserving training target; structured spectrotemporal patterns; supervised learning based speech separation; time-frequency masking function; Noise measurement; Signal to noise ratio; Spectrogram; Speech; Speech enhancement; Training; Speech separation; deep neural networks; spectro-temporal patterns; training target;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854777