Title :
A two-stage approach for improving the perceptual quality of separated speech
Author :
Williamson, Donald S.; Wang, Yuxuan; Wang, DeLiang
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
Abstract :
Binary time-frequency masking and model-based nonnegative matrix factorization (NMF) are two common approaches to speech separation. However, binary masking often suffers from poor perceptual quality, while NMF typically requires pretrained models for both speech and noise and frequently does not perform well. In this paper, we examine whether a single-stage or a two-stage approach should be used for separation. We propose a two-stage algorithm that uses a soft mask in the first stage for separation and NMF in the second stage, where only a speech model needs to be trained, for improving perceptual quality. We show that the proposed two-stage approach achieves higher objective perceptual quality and intelligibility than related single-stage methods.
Keywords :
matrix decomposition; speech intelligibility; speech processing; time-frequency analysis; NMF; binary time-frequency masking; model-based nonnegative matrix factorization; perceptual quality; single-stage method; speech separation; two-stage approach; Dictionaries; Hidden Markov models; Noise; Noise measurement; Signal processing algorithms; Speech; binary masking; nonnegative matrix factorization; speech quality
Conference_Title :
2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854964