Regional Information Center for Science and Technology (RICEST) - Learning Spectral Mapping for Speech Dereverberation and Denoising

DocumentCode :

112952

Title :

Learning Spectral Mapping for Speech Dereverberation and Denoising

Author :

Kun Han ; Yuxuan Wang ; DeLiang Wang ; William S. Woods ; Ivo Merks ; Tao Zhang

Author_Institution :

Department of Computer Science and Engineering, Ohio State University, Columbus, OH, USA

Volume :

Issue :

fYear :

2015

fDate :

June 2015

Firstpage :

982

Lastpage :

992

Abstract :

In real-world environments, human speech is usually distorted by both reverberation and background noise, which have negative effects on speech intelligibility and speech quality. They also cause performance degradation in many speech technology applications, such as automatic speech recognition. Therefore, the dereverberation and denoising problems must be dealt with in daily listening environments. In this paper, we propose to perform speech dereverberation using supervised learning, and the supervised approach is then extended to address both dereverberation and denoising. Deep neural networks are trained to directly learn a spectral mapping from the magnitude spectrogram of corrupted speech to that of clean speech. The proposed approach substantially attenuates the distortion caused by reverberation, as well as background noise, and is conceptually simple. Systematic experiments show that the proposed approach leads to significant improvements of predicted speech intelligibility and quality, as well as automatic speech recognition in reverberant noisy conditions. Comparisons show that our approach substantially outperforms related methods.
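
The spectral-mapping idea in the abstract can be illustrated with a minimal sketch in Python with NumPy. It is not the authors' configuration. Everything specific here is an assumption made for illustration: the synthetic amplitude-modulated tone standing in for clean speech, the random decaying impulse response standing in for reverberation, the log(1 + magnitude) compression, the frame-by-frame input, the single hidden layer, and the hypothetical function names log_magnitude_spectrogram, make_training_pair, train_mapping and apply_mapping.

```python
import numpy as np


def log_magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Return log(1 + |STFT|) frames, shape (num_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s:s + frame_len] * window for s in starts])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))


def make_training_pair(seed=0, num_samples=4000, sample_rate=8000):
    """Build (corrupted, clean) spectrograms from a synthetic signal (assumption)."""
    rng = np.random.default_rng(seed)
    t = np.arange(num_samples) / sample_rate
    # Amplitude-modulated tone used as a stand-in for clean speech.
    clean = np.sin(2 * np.pi * 440 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t))
    # Synthetic room impulse response: direct path plus a decaying random tail.
    rir = 0.2 * np.exp(-np.arange(400) / 80.0) * rng.normal(size=400)
    rir[0] = 1.0
    reverberant = np.convolve(clean, rir)[:num_samples]
    corrupted = reverberant + 0.05 * rng.normal(size=num_samples)
    return log_magnitude_spectrogram(corrupted), log_magnitude_spectrogram(clean)


def apply_mapping(params, X):
    """Map corrupted spectral frames to estimated clean frames."""
    W1, b1, W2, b2 = params
    return np.maximum(0.0, X @ W1 + b1) @ W2 + b2


def train_mapping(X, Y, hidden=64, epochs=500, lr=0.02, seed=0):
    """Fit a one-hidden-layer ReLU network by full-batch gradient descent on MSE."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])
    losses = []
    for _ in range(epochs):
        pre = X @ W1 + b1
        H = np.maximum(0.0, pre)
        err = H @ W2 + b2 - Y
        losses.append(float(np.mean(err ** 2)))
        # Gradients of the mean squared error, back-propagated through the ReLU.
        g_out = 2.0 * err / err.size
        g_hidden = (g_out @ W2.T) * (pre > 0)
        W2 -= lr * (H.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_hidden)
        b1 -= lr * g_hidden.sum(axis=0)
    return (W1, b1, W2, b2), losses


X, Y = make_training_pair()
params, losses = train_mapping(X, Y)
print(f"training MSE: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The network only estimates a magnitude spectrogram. Turning that estimate back into a waveform requires a phase, which this sketch leaves out.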

Keywords :

learning (artificial intelligence); neural nets; reverberation; speech intelligibility; speech recognition; automatic speech recognition; background noise; corrupted speech; deep neural networks; human speech; magnitude spectrogram; real-world environments; reverberant noisy conditions; reverberation noise; spectral mapping; speech denoising; speech dereverberation; speech quality; supervised learning; Noise reduction; Spectrogram; Speech; Speech processing; Time-domain analysis; Training; Deep neural networks (DNNs); denoising; dereverberation

fLanguage :

English

Journal_Title :

IEEE/ACM Transactions on Audio, Speech, and Language Processing

Publisher :

IEEE

ISSN :

2329-9290

Type :

jour

DOI :

10.1109/TASLP.2015.2416653

Filename :

7067387

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=112952