DocumentCode :
2946958
Title :
The audio epitome: a new representation for modeling and classifying auditory phenomena
Author :
Kapoor, Ashish ; Basu, Sumit
Author_Institution :
MIT, MA, USA
Volume :
5
fYear :
2005
fDate :
18-23 March 2005
Abstract :
The paper presents a novel representation for auditory environments that can be used to classify events of interest, such as speech, cars, etc., and potentially to classify the environments themselves. We propose a novel discriminative framework that is based on the audio epitome, an audio extension of the image representation developed by N. Jojic et al. (see Proc. Int. Conf. Comp. Vision, 2003). We also develop an informative patch sampling procedure to train the epitomes. This procedure reduces the computational complexity and increases the quality of the epitome. For classification, the training data is used to learn distributions over the epitomes to model the different classes; the distributions for new inputs are then compared to these models. On a task of distinguishing between four auditory classes in the context of environmental sounds (car, speech, birds, utensils), our method outperforms the conventional approaches of nearest neighbor and mixture of Gaussians on three of the four classes.
Keywords :
audio signal processing; computational complexity; signal classification; signal representation; signal sampling; Gaussian mixture; audio epitome; auditory classes; auditory phenomena classification; auditory phenomena modeling; bird sounds; car sounds; discriminative framework; environmental sounds; image representation; informative patch sampling procedure; nearest neighbor; speech sounds; training data; utensil sounds; Birds; Gaussian processes; Image reconstruction; Image sampling; Image segmentation; Nearest neighbor searches; Speech
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8874-7
Type :
conf
DOI :
10.1109/ICASSP.2005.1416272
Filename :
1416272