Title :
Computational auditory scene analysis exploiting speech-recognition knowledge
Author_Institution :
Int. Comput. Sci. Inst., Berkeley, CA, USA
Abstract :
The field of computational auditory scene analysis (CASA) strives to build computer models of the human ability to interpret sound mixtures as the combination of distinct sources. A major obstacle to this enterprise is defining and incorporating the kind of high-level knowledge of real-world signal structure exploited by listeners. Speech recognition, while typically ignoring the problem of nonspeech inclusions, has been very successful at deriving powerful statistical models of speech structure from training data. In this paper, we describe a scene analysis system that includes both speech and nonspeech components, addressing the problem of working backwards from speech recognizer output to estimate the speech component of a mixture. Ultimately, such hybrid approaches will require more radical adaptation of current speech recognition approaches.
Keywords :
iterative methods; speech recognition; CASA; computational auditory scene analysis; distinct sources; high level knowledge; hybrid approaches; nonspeech component; real-world signal structure; sound mixtures; speech component; speech structure; speech-recognition knowledge; statistical models; Automatic speech recognition; Computer science; Hidden Markov models; Humans; Image analysis; Layout; Power system modeling; Speech analysis; Speech processing; Speech recognition
Conference_Title :
1997 IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics
Conference_Location :
New Paltz, NY
Print_ISBN :
0-7803-3908-8
DOI :
10.1109/ASPAA.1997.625625