Missing feature reconstruction methods for robust speaker identification

Author

Xueliang Zhang ; Hui Zhang ; Guanglai Gao

Author_Institution

Comput. Sci. Dept., Inner Mongolia Univ., Hohhot, China

fYear

2014

fDate

1-5 Sept. 2014

Firstpage

1482

Lastpage

1486

Abstract

In this study, we propose a reconstruction method that restores degraded features for robust speaker identification. The proposed method is based on a hybrid generative model consisting of a deep belief network (DBN) and a restricted Boltzmann machine (RBM). Specifically, the noisy speech is first decomposed into a time-frequency (T-F) representation. An ideal binary mask (IBM) is then computed to label each T-F point as reliable or unreliable, and the unreliable points are reconstructed iteratively by the proposed model. Finally, the reconstructed features are fed to a conventional speaker identification system. Experiments demonstrate that the proposed method achieves significant performance improvements over previous missing feature techniques across a wide range of signal-to-noise ratios.
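
The masking and iterative reconstruction steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' hybrid DBN/RBM model: it assumes a single, already trained RBM with sigmoid units, features scaled to [0, 1], and access to the premixed speech and noise energies (which the ideal mask requires by definition). The function names, the 0 dB threshold default, and the iteration count are all hypothetical choices made for illustration.

```python
import math


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Label each T-F unit 1 (reliable) if its local SNR exceeds lc_db, else 0.

    Both inputs are lists of frames; each frame is a list of per-channel
    energies taken from the premixed speech and noise signals.
    """
    eps = 1e-12  # guards against log(0) and division by zero
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            snr_db = 10.0 * math.log10((s + eps) / (n + eps))
            row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def reconstruct_frame(observed, mask, W, b_vis, b_hid, n_iter=20):
    """Fill the unreliable units of one frame by iterating an RBM.

    Assumption: W (n_vis x n_hid), b_vis and b_hid come from an RBM trained
    on clean features. Reliable units stay clamped to their observed values;
    unreliable units are replaced by the RBM's mean-field estimate each pass.
    """
    n_vis, n_hid = len(b_vis), len(b_hid)
    v = list(observed)
    for _ in range(n_iter):
        # Upward pass: hidden activation probabilities given the visibles.
        h = [sigmoid(b_hid[j] + sum(v[i] * W[i][j] for i in range(n_vis)))
             for j in range(n_hid)]
        # Downward pass: visible estimates given the hidden probabilities.
        v_hat = [sigmoid(b_vis[i] + sum(h[j] * W[i][j] for j in range(n_hid)))
                 for i in range(n_vis)]
        # Clamp reliable units, update only the unreliable ones.
        v = [observed[i] if mask[i] else v_hat[i] for i in range(n_vis)]
    return v
```

In use, each frame of the noisy T-F representation would be passed to reconstruct_frame together with its row of the mask, and the resulting frames would then go to the speaker identification back end. Clamping the reliable units is what separates this kind of reconstruction from plain denoising: trusted observations are never altered.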

Keywords

signal reconstruction; signal representation; signal restoration; speaker recognition; time-frequency analysis; DBN; IBM; RBM; T-F point; T-F representations; deep belief network; hybrid generative model; ideal binary mask; missing feature reconstruction methods; noisy speech; restricted Boltzmann machine; robust speaker identification system; signal-to-noise ratios; time-frequency representations; Abstracts; Adaptation models; Computational modeling; Data models; Production facilities; Robustness; Smoothing methods; Deep belief network; Missing feature techniques; Restricted Boltzmann machine; Robust speaker identification;

fLanguage

English

Publisher

IEEE

Conference_Titel

2014 Proceedings of the 22nd European Signal Processing Conference (EUSIPCO)

Conference_Location

Lisbon

Type

conf

Filename

6952536