A framework for audio analysis based on classification and temporal segmentation

Author

Tzanetakis, George ; Cook, Perry

Author_Institution

Dept. of Comput. Sci., Princeton Univ., NJ, USA

Volume

2

fYear

1999

fDate

1999

Firstpage

61

Abstract

Existing audio tools handle the increasing amount of computer audio data inadequately. The typical tape-recorder paradigm for audio interfaces is inflexible and time-consuming, especially for large data sets. On the other hand, completely automatic audio analysis and annotation is impossible using current techniques. Alternative solutions are semi-automatic user interfaces that let users interact with sound in flexible ways based on content. This approach offers significant advantages over manual browsing, annotation, and retrieval. Furthermore, it can be implemented using existing techniques for audio content analysis in restricted domains. This paper describes a framework for experimenting with, evaluating, and integrating such techniques. As a test for the architecture, some recently proposed techniques have been implemented and tested. In addition, a new method for temporal segmentation based on audio texture is described. This method is combined with audio analysis techniques and used for hierarchical browsing, classification, and annotation of audio files.

Keywords

audio signal processing; speech-based user interfaces; annotation; audio analysis; audio interfaces; audio tools; classification; completely automatic audio analysis; computer audio data; framework; manual browsing; tape-recorder paradigm; temporal segmentation; Computer interfaces; Computer science; Humans; Information analysis; Information retrieval; Internet; Optical computing; Search engines; Testing; Web search

fLanguage

English

Publisher

IEEE

Conference_Titel

EUROMICRO Conference, 1999. Proceedings. 25th

Conference_Location

Milan

ISSN

1089-6503

Print_ISBN

0-7695-0321-7

Type

conf

DOI

10.1109/EURMIC.1999.794763

Filename

794763