DocumentCode :
661416
Title :
Analyzing the dictionary properties and sparsity constraints for a dictionary-based music genre classification system
Author :
Ping-Keng Jao ; Li Su ; Yi-Hsuan Yang
Author_Institution :
Res. Center for Inf. Technol. Innovation, Acad. Sinica, Taipei, Taiwan
fYear :
2013
fDate :
Oct. 29 2013-Nov. 1 2013
Firstpage :
1
Lastpage :
8
Abstract :
Learning dictionaries from a large-scale music database is a burgeoning research topic in the music information retrieval (MIR) community. It has been shown that classification systems based on such learned features exhibit state-of-the-art accuracy in many music classification benchmarks. Although the general approach of dictionary-based MIR has been shown to be effective, little work has been done to investigate the relationship between system performance and dictionary properties, such as dictionary sparsity, coherence, and condition number. This paper aims to address this issue by systematically evaluating the performance of three dictionary learning algorithms for the task of genre classification: the least-squares-based RLS (recursive least squares) algorithm, and two variants of the stochastic gradient descent-based algorithm ODL (online dictionary learning) with different regularization functions. Specifically, we learn the dictionary with the USPOP2002 dataset and perform genre classification with the GTZAN dataset. Our results show that setting strict sparsity constraints in RLS-based dictionary learning (i.e., <1% of the signal dimension) leads to better accuracy in genre classification (around 80% when a linear kernel support vector classifier is adopted). Moreover, we find that different sparsity constraints are needed for the dictionary learning phase and the encoding phase. Important links between dictionary properties and classification accuracy are also identified, such as a strong correlation between reconstruction error and classification accuracy in all algorithms. These findings help the design of future dictionary-based MIR systems and the selection of important dictionary learning parameters.
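The dictionary properties named in the abstract (sparsity, coherence, and condition number) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the column-as-atom layout, and the nonzero tolerance `tol` are assumptions made here for illustration, and atoms are assumed to have nonzero norm.

```python
import numpy as np


def mutual_coherence(D):
    """Largest absolute inner product between two distinct unit-normalized atoms.

    Assumes atoms are the columns of D and that no atom is all zeros.
    """
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)  # normalize each column
    G = np.abs(Dn.T @ Dn)                              # absolute Gram matrix
    np.fill_diagonal(G, 0.0)                           # ignore self-products
    return float(G.max())


def condition_number(D):
    """2-norm condition number: largest singular value over the smallest."""
    return float(np.linalg.cond(D))


def sparsity_ratio(code, tol=1e-12):
    """Fraction of coefficients in a code vector whose magnitude exceeds tol."""
    return float(np.mean(np.abs(np.asarray(code)) > tol))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.standard_normal((64, 128))  # hypothetical overcomplete dictionary
    print("coherence:", mutual_coherence(D))
    print("condition number:", condition_number(D))
    print("sparsity:", sparsity_ratio([0.0, 0.0, 1.5, 0.0]))
```

Under this definition, a sparsity constraint of less than 1% of the signal dimension corresponds to a `sparsity_ratio` below 0.01 when the code length equals the signal dimension; how the paper counts nonzeros exactly is not stated in the abstract.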
Keywords :
audio databases; gradient methods; information retrieval; learning (artificial intelligence); least squares approximations; music; pattern classification; stochastic processes; GTZAN dataset; MIR community; RLS-based dictionary learning algorithms; USPOP2002 dataset; classification accuracy; dictionary coherence; dictionary properties; dictionary sparsity; dictionary-based MIR systems; genre classification; large-scale music database; least-square based RLS algorithm; linear kernel support vector classifier; music information retrieval community; online dictionary learning; performance evaluation; reconstruction error; recursive least square algorithm; regularization functions; signal dimension; sparsity constraints; stochastic gradient descent-based algorithm ODL; Accuracy; Algorithm design and analysis; Classification algorithms; Dictionaries; Encoding; Kernel; Support vector machines;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific
Conference_Location :
Kaohsiung
Type :
conf
DOI :
10.1109/APSIPA.2013.6694278
Filename :
6694278