DocumentCode :
1414284
Title :
Segmentation, Indexing, and Retrieval for Environmental and Natural Sounds
Author :
Wichern, Gordon ; Xue, Jiachen ; Thornburg, Harvey ; Mechtley, Brandon ; Spanias, Andreas
Author_Institution :
Sch. of Arts, Media, & Eng., Arizona State Univ., Tempe, AZ, USA
Volume :
18
Issue :
3
fYear :
2010
fDate :
3/1/2010 12:00:00 AM
Firstpage :
688
Lastpage :
707
Abstract :
We propose a method for characterizing sound activity in fixed spaces through segmentation, indexing, and retrieval of continuous audio recordings. Regarding segmentation, we present a dynamic Bayesian network (DBN) that jointly infers onsets and end times of the most prominent sound events in the space, along with an extension of the algorithm for covering large spaces with distributed microphone arrays. Each segmented sound event is indexed with a hidden Markov model (HMM) that models the distribution of example-based queries that a user would employ to retrieve the event (or similar events). In order to increase the efficiency of the retrieval search, we recursively apply a modified spectral clustering algorithm to group similar sound events based on the distance between their corresponding HMMs. We then conduct a formal user study to obtain the relevancy decisions necessary for evaluation of our retrieval algorithm on both automatically and manually segmented sound clips. Furthermore, our segmentation and retrieval algorithms are shown to be effective in both quiet indoor and noisy outdoor recording conditions.
Keywords :
Bayes methods; acoustic signal detection; acoustic signal processing; audio databases; audio signal processing; belief networks; content-based retrieval; database indexing; hidden Markov models; microphone arrays; pattern clustering; spectral analysis; continuous audio recording; distributed microphone arrays; dynamic Bayesian network; environmental sounds; event retrieval; example-based query; fixed space sound activity characterization; hidden Markov model; natural sounds; relevancy decision; retrieval search; sound clips; sound event; sound indexing; sound retrieval; sound segmentation; spectral clustering algorithm; Audio recording; Bayesian methods; Clustering algorithms; Databases; Hidden Markov models; Humans; Indexing; Layout; Microphone arrays; Speech; Acoustic signal analysis; Bayes procedures; acoustic signal detection; clustering methods; database query processing;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2010.2041384
Filename :
5410056