Title :
Separation of speech from interfering sounds based on oscillatory correlation
Author :
Wang, DeLiang L. ; Brown, Guy J.
Author_Institution :
Dept. of Comput. & Inf. Sci., Ohio State Univ., Columbus, OH, USA
Date :
5/1/1999
Abstract :
A multistage neural model is proposed for an auditory scene analysis task: segregating speech from interfering sound sources. The core of the model is a two-layer oscillator network that performs stream segregation on the basis of oscillatory correlation. In the oscillatory correlation framework, a stream is represented by a population of synchronized relaxation oscillators, each of which corresponds to an auditory feature, and different streams are represented by desynchronized oscillator populations. Lateral connections between oscillators encode harmonicity and proximity in frequency and time. Prior to the oscillator network are a model of the auditory periphery and a stage in which mid-level auditory representations are formed. The model has been systematically evaluated using a corpus of voiced speech mixed with interfering sounds, and it produces an improvement in signal-to-noise ratio for every mixture. A number of issues, including biological plausibility and real-time implementation, are also discussed.
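Illustration :
The abstract describes streams as synchronized populations of relaxation oscillators coupled by lateral connections. Below is a minimal sketch of this idea, assuming the oscillators take the Terman-Wang form used in the authors' related LEGION work; the parameter values, coupling weight, and two-oscillator setup are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def terman_wang_step(x, y, I, dt=0.01, eps=0.02, gamma=6.0, beta=0.1):
    """One forward-Euler step of a Terman-Wang relaxation oscillator.

    x: fast excitatory variable; y: slow recovery variable;
    I: total input (external stimulation plus lateral coupling).
    Parameter values are illustrative, not drawn from the paper.
    """
    dx = 3.0 * x - x**3 + 2.0 - y + I
    dy = eps * (gamma * (1.0 + np.tanh(x / beta)) - y)
    return x + dt * dx, y + dt * dy

# Two oscillators with excitatory lateral coupling: with a sufficiently
# strong connection they phase-lock, illustrating how a "stream" is
# represented by a synchronized oscillator population.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=2)   # random initial phases
y = rng.uniform(0.0, 8.0, size=2)
w = 0.5        # lateral connection weight (hypothetical value)
I_ext = 0.8    # external stimulation for both oscillators

for step in range(60000):
    # Each oscillator receives coupling from its partner, gated by a
    # Heaviside-like threshold on the partner's activity x.
    s = w * (x[::-1] > 0.0).astype(float)
    x, y = terman_wang_step(x, y, I_ext + s)

print("final activities (similar values indicate phase-locking):", x)
```

In the full model, such lateral weights would be set from harmonicity and time-frequency proximity cues, so that oscillators belonging to the same source synchronize while oscillators of different sources desynchronize.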
Keywords :
correlation methods; harmonic analysis; neural nets; speech coding; speech recognition; auditory scene analysis; encoding; harmonicity; multistage neural model; oscillatory correlation; real-time system; speech segregation; speech signal separation; stream segregation; two-layer oscillator network; Automatic speech recognition; Biological system modeling; Cognitive science; Ear; Frequency; Image analysis; Inference algorithms; Oscillators; Speech analysis; Speech recognition;
Journal_Title :
IEEE Transactions on Neural Networks