Title :
Underconstrained stochastic representations for top-down computational auditory scene analysis
Author :
Ellis, Daniel P W
Author_Institution :
Media Lab., MIT, Cambridge, MA, USA
Abstract :
I propose a structure for the first stage of a computer system capable of performing complex auditory scene analysis similar to that accomplished by human listeners. This structure contains the following innovations over previous approaches: (1) Sound is represented as discrete elements drawn from an overcomplete vocabulary encompassing both tonal and less structured sounds, designed to highlight the interdependence in the acoustic energy. (2) Through the redundancy of the basis, this analysis permits, and indeed requires, the imposition of additional constraints, which provides for the incorporation of top-down or context-sensitive factors. (3) A modular architecture operates on an analysis-by-synthesis principle, where processes are invoked until the representation adequately accounts for the observed sound. A common goodness-of-fit criterion allows for future expansion of the system with new explanation rules, new representational elements, and more abstract levels of analysis. Some initial results of applying these ideas to scenes consisting of noise bursts and dense environmental sound are presented.
Keywords :
acoustic signal processing; hearing; pattern recognition; redundancy; signal representation; stochastic processes; acoustic energy; analysis-by-synthesis principle; computer system; context-sensitive factors; dense environmental sound; discrete elements; human listeners; less structured sounds; modular architecture; noise bursts; overcomplete vocabulary; representational elements; tonal sounds; top-down computational auditory scene analysis; underconstrained stochastic representations; Acoustic noise; Computational modeling; Humans; Image analysis; Layout; Psychology; Technological innovation; Vocabulary; Working environment noise
Conference_Title :
Applications of Signal Processing to Audio and Acoustics, 1995, IEEE ASSP Workshop on
Conference_Location :
New Paltz, NY
Print_ISBN :
0-7803-3064-1
DOI :
10.1109/ASPAA.1995.482909