DocumentCode :
3164215
Title :
Parameter-Free Audio Motif Discovery in Large Data Archives
Author :
Hao, Yuan ; Shokoohi-Yekta, Mohammad ; Papageorgiou, George ; Keogh, Eamonn
fYear :
2013
fDate :
7-10 Dec. 2013
Firstpage :
261
Lastpage :
270
Abstract :
The discovery of repeated structure, i.e. motifs/near-duplicates, is often the first step in exploratory data mining. As such, the last decade has seen extensive research efforts in motif discovery algorithms for text, DNA, time series, protein sequences, graphs, images, and video. Surprisingly, there has been less attention devoted to finding repeated patterns in audio sequences, in spite of their ubiquity in science and entertainment. While there is significant work for the special case of motifs in music, virtually all this work makes many assumptions about data (often to the point of being genre specific) and thus these algorithms do not generalize to audio sequences containing animal vocalizations, industrial processes, or a host of other domains that we may wish to explore. In this work we introduce a novel technique for finding audio motifs. Our method does not require any domain-specific tuning and is essentially parameter-free. We demonstrate our algorithm on very diverse domains, finding audio motifs in laboratory mice vocalizations, wild animal sounds, music, and human speech. Our experiments demonstrate that our ideas are effective in discovering objectively correct or subjectively plausible motifs. Moreover, we show our novel probabilistic early abandoning approach is efficient, being two to three orders of magnitude faster than brute-force search, and thus faster than real-time for most problems.
Keywords :
audio signal processing; data mining; music; probability; animal vocalizations; audio motif finding technique; audio sequence repeated pattern finding; brute-force search; domain-specific tuning; exploratory data mining; human speech; industrial processes; laboratory mice vocalizations; large data archives; parameter-free audio motif discovery; probabilistic early abandoning approach; repeated structure discovery; wild animal sounds; Data mining; Feature extraction; Heuristic algorithms; Mice; Music; Spectrogram; Speech; anytime algorithm; audio motif; spectrogram;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining (ICDM), 2013 IEEE 13th International Conference on
Conference_Location :
Dallas, TX
ISSN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2013.30
Filename :
6729510