Title :
Audio scene segmentation using multiple features, models and time scales
Author :
Sundaram, Hari ; Chang, Shih-Fu
Author_Institution :
Dept. of Electr. Eng., Columbia Univ., New York, NY, USA
Abstract :
We present an algorithm for audio scene segmentation. An audio scene is a semantically consistent sound segment characterized by a few dominant sources of sound; a scene change occurs when a majority of the sources present in the data change. Our segmentation framework has three parts: a definition of an audio scene; multiple feature models that characterize the dominant sources; and a simple, causal listener model that mimics human audition using multiple time scales. We define a correlation function against past data to determine segmentation boundaries. The algorithm was tested on a difficult data set, a one-hour audio segment of a film, with impressive results: it achieves an audio scene change detection accuracy of 97%.
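The causal, memory-based boundary test described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-frame feature vectors, the memory length, and the correlation threshold are all hypothetical placeholders, and the paper's actual multiple-feature, multiple-time-scale models are reduced here to a single Pearson correlation against the mean of recent frames.

```python
import numpy as np

def detect_scene_changes(features, memory=8, threshold=0.5):
    """Hypothetical sketch of causal scene-change detection.

    `features` is a (T, D) array of per-frame audio feature vectors.
    A frame is flagged as a boundary when it correlates poorly with
    the average of the previous `memory` frames (the listener's
    attention span); `memory` and `threshold` are illustrative values.
    """
    boundaries = []
    for t in range(memory, len(features)):
        past = features[t - memory:t].mean(axis=0)  # summary of recent past
        # Pearson correlation between the current frame and the past summary
        r = np.corrcoef(features[t], past)[0, 1]
        if r < threshold:
            boundaries.append(t)
    return boundaries
```

In this sketch, a run of consecutive low-correlation frames follows each true change until the memory refills with frames from the new scene, so a real system would additionally merge nearby detections into a single boundary.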
Keywords :
audio signal processing; hearing; multimedia systems; audio scene change detection; audio scene definition; audio scene segmentation; causal listener model; correlation function; data set; human audition; multimedia; multiple feature models; multiple time-scales; Data mining; Detection algorithms; Feature extraction; Layout; Music; Organizing; Speech; Testing;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
Print_ISBN :
0-7803-6293-4
DOI :
10.1109/ICASSP.2000.859335