DocumentCode
730085
Title
A pairwise algorithm for pitch estimation and speech separation using deep stacking network
Author
Hui Zhang ; Xueliang Zhang ; Shuai Nie ; Guanglai Gao ; Wenju Liu
Author_Institution
Comput. Sci. Dept., Inner Mongolia Univ., Hohhot, China
fYear
2015
fDate
19-24 April 2015
Firstpage
246
Lastpage
250
Abstract
Pitch information is an important cue for speech separation. However, pitch estimation in noisy conditions is itself as challenging as speech separation. In this paper, we propose a supervised learning architecture that handles both problems together. The proposed algorithm is based on the deep stacking network (DSN), which builds a deep architecture by stacking simple processing modules. In the training stage, the ideal binary mask is used as the target. The input vector of each module includes the outputs of the lower module and frame-level features, which consist of spectral and pitch-based features. In the testing stage, each module produces an estimated binary mask, which is used to re-estimate the pitch. The updated pitch-based features are then passed to the next module. This procedure is applied iteratively within the DSN, and the final separation result is obtained from the last module. Systematic evaluations show that the proposed approach produces high-quality estimated binary masks and outperforms recent systems in terms of generalization.
Keywords
learning (artificial intelligence); speech processing; binary mask; deep stacking network; pairwise algorithm; pitch estimation; speech separation; supervised learning architecture; Noise; Speech; Testing; Training; Computational auditory scene analysis; Pitch estimation; Speech separation; Supervised learning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7177969
Filename
7177969