DocumentCode
1749214
Title
An extended model for speech segregation
Author
Hu, Guoning ; Wang, DeLiang
Author_Institution
Biophys. Program, Ohio State Univ., Columbus, OH, USA
Volume
2
fYear
2001
fDate
2001
Firstpage
1089
Abstract
Speech segregation is an important task of auditory scene analysis (ASA), in which the speech of a particular speaker is separated from other interfering signals. Wang and Brown (1999) proposed a multistage neural model for speech segregation, the core of which is a two-layer oscillator network. We extend their model by adding further processes, based on psychoacoustic evidence, to improve performance. These processes include estimation of the pitch of the target speech and refined generation of a target speech stream using the estimated pitch. Our model is systematically evaluated and compared with the Wang-Brown model, and it yields significantly better performance.
Keywords
neural nets; physiological models; speech processing; speech synthesis; auditory scene analysis; extended model; multistage neural model; pitch estimation; psychoacoustic evidence; speech segregation; two-layer oscillator network; Automatic speech recognition; Image analysis; Oscillators; Psychoacoustic models; Psychology; Speech analysis; Speech enhancement; Speech processing; Speech synthesis; Wideband;
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks, 2001. Proceedings. IJCNN '01. International Joint Conference on
Conference_Location
Washington, DC
ISSN
1098-7576
Print_ISBN
0-7803-7044-9
Type
conf
DOI
10.1109/IJCNN.2001.939512
Filename
939512