Title :
Towards efficient audio thumbnailing
Author :
Jiang, Nanzhu; Müller, Meinard
Author_Institution :
International Audio Laboratories Erlangen, Erlangen, Germany
Abstract :
Audio thumbnailing, which aims at finding the most representative audio segment of a music recording, is an important task in music information retrieval. In this paper, we show how the computational efficiency of a recently proposed state-of-the-art thumbnailing approach can be improved significantly. The basic idea of the previous approach is to compute, for each possible segment, a fitness value that expresses repetitiveness, and then to define the thumbnail as the fitness-maximizing segment. As a first acceleration strategy, we propose an efficient multi-level sampling strategy that reduces the number of segments for which the fitness has to be computed. Second, we obtain further accelerations by suitably adjusting the resolution used in the fitness computation depending on the level of the segment. As a third contribution, we exploit an intrinsic property of the fitness computation that allows us to estimate the fitness for certain segments without any further computation. Our experimental results show that combining these three strategies leads to accelerations by a factor of 20 to 200, depending on the duration of the song, while preserving the overall accuracy of the thumbnail estimation.
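The idea of evaluating the fitness on a sampled subset of segments, rather than on every possible segment, can be illustrated with a minimal coarse-to-fine search. This is only a sketch under stated assumptions and not the authors' algorithm: the function name `coarse_to_fine_thumbnail`, its parameters, and the halving refinement scheme are hypothetical, and the `fitness` callable is a placeholder for whatever repetitiveness measure is used. The resolution adjustment and the fitness estimation mentioned in the abstract are not modeled here.

```python
from typing import Callable, Optional, Tuple

Segment = Tuple[int, int]  # (start, end) frame indices, end exclusive


def coarse_to_fine_thumbnail(
    n_frames: int,
    fitness: Callable[[int, int], float],
    min_len: int,
    coarse_step: int,
) -> Segment:
    """Return an approximately fitness-maximizing segment.

    Hypothetical sketch: the fitness is first evaluated on a coarse grid of
    (start, end) pairs; the grid step is then halved repeatedly and only
    segments near the current best candidate are evaluated.
    """

    def best_on_grid(starts, ends, current):
        # Scan all valid (start, end) pairs and keep the best one seen so far.
        best, best_fit = current
        for s in starts:
            for e in ends:
                if s >= 0 and e <= n_frames and e - s >= min_len:
                    f = fitness(s, e)
                    if f > best_fit:
                        best, best_fit = (s, e), f
        return best, best_fit

    # Level 0: coarse sampling over the whole recording.
    step = coarse_step
    grid = range(0, n_frames + 1, step)
    start: Tuple[Optional[Segment], float] = (None, float("-inf"))
    best, best_fit = best_on_grid(grid, grid, start)
    if best is None:
        raise ValueError("no segment satisfies the minimum length")

    # Finer levels: halve the step and search a small window around the best.
    while step > 1:
        step = max(1, step // 2)
        s0, e0 = best
        starts = range(s0 - 2 * step, s0 + 2 * step + 1, step)
        ends = range(e0 - 2 * step, e0 + 2 * step + 1, step)
        best, best_fit = best_on_grid(starts, ends, (best, best_fit))

    return best
```

A local refinement of this kind can miss the global maximum when the fitness varies sharply between neighboring segments, so the coarse step is a trade-off between speed and accuracy.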
Keywords :
audio recording; audio signal processing; signal sampling; acceleration strategy; efficient audio thumbnailing approach; fitness-maximizing segment; multilevel sampling strategy; music information retrieval; music recording; representative audio segment; thumbnail estimation; Acceleration; Accuracy; Audio recording; Estimation; Laboratories; Music; Visualization; Audio structure analysis; audio thumbnailing; efficiency
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854593