Title :
Binaural Classification for Reverberant Speech Segregation Using Deep Neural Networks
Author :
Yi Jiang ; DeLiang Wang ; Runsheng Liu ; ZhenMing Feng
Author_Institution :
Department of Electronic Engineering, Tsinghua University, Beijing, China
Abstract :
Speech signal degradation in real environments mainly results from room reverberation and concurrent noise. While human listening is robust in complex auditory scenes, current speech segregation algorithms do not perform well in noisy and reverberant environments. We treat the binaural segregation problem as binary classification, and employ deep neural networks (DNNs) for the classification task. The binaural features of the interaural time difference and interaural level difference are used as the main auditory features for classification. The monaural feature of gammatone frequency cepstral coefficients is also used to improve classification performance, especially when interference and target speech are collocated or very close to one another. We systematically examine DNN generalization to untrained spatial configurations. Evaluations and comparisons show that DNN-based binaural classification produces superior segregation performance in a variety of multisource and reverberant conditions.
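As a rough illustration of the two binaural cues named in the abstract, the sketch below estimates an interaural time difference (ITD) from the peak of a cross-correlation and an interaural level difference (ILD) from an energy ratio in decibels, for a single frame of left-ear and right-ear samples. The abstract does not say how the features are computed, so this is not the authors' implementation: the function name `itd_ild`, the parameters `sample_rate` and `max_lag`, the sign convention, and the small `eps` constant are all assumptions made for this example. The gammatone filterbank front end and the DNN classifier are omitted.

```python
import math


def itd_ild(left, right, sample_rate, max_lag):
    """Estimate ITD (seconds) and ILD (dB) for one frame of samples.

    Hypothetical helper for illustration only.
    Sign convention (an assumption): a positive ITD means the right-ear
    signal lags the left-ear signal.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[t] with right[t + lag] over the overlapping samples.
        score = 0.0
        for t in range(n):
            u = t + lag
            if 0 <= u < n:
                score += left[t] * right[u]
        if score > best_score:
            best_lag, best_score = lag, score

    eps = 1e-12  # guards against log(0) and division by zero on silent frames
    energy_left = sum(x * x for x in left) + eps
    energy_right = sum(x * x for x in right) + eps
    ild_db = 10.0 * math.log10(energy_left / energy_right)

    return best_lag / sample_rate, ild_db


# Toy frame: the right channel is the left channel delayed by 3 samples
# and attenuated to half amplitude.
left = [0.0] * 64
right = [0.0] * 64
left[10], left[11] = 1.0, 0.5
right[13], right[14] = 0.5, 0.25

itd, ild = itd_ild(left, right, sample_rate=16000, max_lag=8)
```

In a classification setting such as the one the abstract describes, cues like these would typically be computed per frequency channel and per time frame and passed to the classifier as a feature vector; the exact feature layout here is left unspecified because the abstract does not give it.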
Keywords :
neural nets; signal classification; speech processing; DNN-based binaural classification; concurrent noise; gammatone frequency cepstral coefficients; interaural level difference; interaural time difference; reverberant speech segregation algorithm; room reverberation; speech signal degradation; Azimuth; Feature extraction; Interference; Signal to noise ratio; Speech; Training; Binary classification; computational auditory scene analysis (CASA); deep neural networks (DNNs); speech segregation
Journal_Title :
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
DOI :
10.1109/TASLP.2014.2361023