Deep neural networks for cochannel speaker identification

Author

Xiaojia Zhao ; Yuxuan Wang ; DeLiang Wang

Author_Institution

Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA

fYear

2015

fDate

19-24 April 2015

Firstpage

4824

Lastpage

4828

Abstract

Speaker identification (SID) in cochannel speech, where two speakers talk simultaneously over a single recording channel, is a challenging problem. Previous studies address this problem in anechoic environments under the Gaussian mixture model (GMM) framework, whereas cochannel SID in reverberant conditions has not been addressed. This paper studies cochannel SID in both anechoic and reverberant conditions. We explore deep neural networks (DNNs) for cochannel SID and propose a DNN-based recognition system. Evaluation results demonstrate that the proposed DNN-based system outperforms two state-of-the-art cochannel SID systems in both anechoic and reverberant conditions and across various target-to-interferer ratios.

Keywords

neural nets; source separation; speech recognition; anechoic conditions; cochannel speaker identification; cochannel speech; deep neural networks; reverberant conditions; Accuracy; NIST; Robustness; Speech; Training; Training data; Gaussian mixture model; deep neural network; reverberation; target-to-interferer ratio

fLanguage

English

Publisher

IEEE

Conference_Titel

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference_Location

South Brisbane, QLD

Type

conf

DOI

10.1109/ICASSP.2015.7178887

Filename

7178887