Regional Information Center for Science and Technology - Analyzing the dictionary properties and sparsity constraints for a dictionary-based music genre classification system

DocumentCode :

661416

Title :

Analyzing the dictionary properties and sparsity constraints for a dictionary-based music genre classification system

Author :

Ping-Keng Jao ; Li Su ; Yi-Hsuan Yang

Author_Institution :

Res. Center for Inf. Technol. Innovation, Acad. Sinica, Taipei, Taiwan

fYear :

2013

fDate :

Oct. 29 2013-Nov. 1 2013

Firstpage :

Lastpage :

Abstract :

Learning dictionaries from a large-scale music database is a burgeoning research topic in the music information retrieval (MIR) community. Classification systems based on such learned features have been shown to achieve state-of-the-art accuracy on many music classification benchmarks. Although the general approach of dictionary-based MIR has been shown to be effective, little work has investigated the relationship between system performance and dictionary properties, such as the sparsity, coherence, and condition number of the dictionary. This paper addresses this issue by systematically evaluating the performance of three dictionary learning algorithms on the task of genre classification: the least-squares-based RLS (recursive least squares) algorithm and two variants of the stochastic gradient descent-based ODL (online dictionary learning) algorithm with different regularization functions. Specifically, we learn the dictionary with the USPOP2002 dataset and perform genre classification with the GTZAN dataset. Our results show that setting strict sparsity constraints in RLS-based dictionary learning (i.e., <1% of the signal dimension) leads to better genre classification accuracy (around 80% when a linear-kernel support vector classifier is adopted). Moreover, we find that different sparsity constraints are needed for the dictionary learning phase and the encoding phase. Important links between dictionary properties and classification accuracy are also identified, such as a strong correlation between reconstruction error and classification accuracy in all algorithms. These findings help the design of future dictionary-based MIR systems and the selection of important dictionary learning parameters.
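
The dictionary properties named in the abstract can be illustrated with a small sketch. The following is not the authors' code; it is a minimal pure-Python example under stated assumptions: a dictionary is a list of atoms (each a list of floats), coherence is taken to be the usual mutual coherence (largest absolute inner product between distinct unit-norm atoms), and sparsity is measured as the number of non-zero coefficients relative to the signal dimension, following the abstract's "<1% of the signal dimension" phrasing. All function names and the toy data are hypothetical. The condition number is omitted because it needs a singular value decomposition; with NumPy available, `numpy.linalg.cond` would be one way to obtain it.

```python
import math


def normalize(atom):
    """Scale an atom (list of floats) to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in atom))
    if norm == 0.0:
        raise ValueError("a zero atom cannot be normalized")
    return [x / norm for x in atom]


def mutual_coherence(atoms):
    """Largest absolute inner product between two distinct unit-norm atoms."""
    unit = [normalize(a) for a in atoms]
    best = 0.0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            dot = sum(p * q for p, q in zip(unit[i], unit[j]))
            best = max(best, abs(dot))
    return best


def sparsity_ratio(code, signal_dim, tol=1e-12):
    """Non-zero coefficients divided by the signal dimension (assumed definition)."""
    nonzeros = sum(1 for c in code if abs(c) > tol)
    return nonzeros / signal_dim


def reconstruction_error(signal, atoms, code):
    """Euclidean norm of the residual: signal minus the weighted sum of atoms."""
    approx = [0.0] * len(signal)
    for coeff, atom in zip(code, atoms):
        for d, value in enumerate(atom):
            approx[d] += coeff * value
    return math.sqrt(sum((s - a) ** 2 for s, a in zip(signal, approx)))


# Toy data (hypothetical): three atoms in a 2-dimensional signal space.
atoms = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
signal = [2.0, 2.0]
code = [0.0, 0.0, 2.0]  # the signal is represented by the third atom alone

print("coherence:", mutual_coherence(atoms))
print("sparsity ratio:", sparsity_ratio(code, len(signal)))
print("reconstruction error:", reconstruction_error(signal, atoms, code))
```

In this toy case the code uses one atom out of a 2-dimensional signal, so the ratio is far above the <1% regime reported in the abstract; the point is only to show how each quantity is computed.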

Keywords :

audio databases; gradient methods; information retrieval; learning (artificial intelligence); least squares approximations; music; pattern classification; stochastic processes; GTZAN dataset; MIR community; RLS-based dictionary learning algorithms; USPOP2002 dataset; classification accuracy; dictionary coherence; dictionary properties; dictionary sparsity; dictionary-based MIR systems; genre classification; large-scale music database; least-square based RLS algorithm; linear kernel support vector classifier; music information retrieval community; online dictionary learning; performance evaluation; reconstruction error; recursive least square algorithm; regularization functions; signal dimension; sparsity constraints; stochastic gradient descent-based algorithm ODL; Accuracy; Algorithm design and analysis; Classification algorithms; Dictionaries; Encoding; Kernel; Support vector machines;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific

Conference_Location :

Kaohsiung

Type :

conf

DOI :

10.1109/APSIPA.2013.6694278

Filename :

6694278

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=661416