• DocumentCode
    661416
  • Title
    Analyzing the dictionary properties and sparsity constraints for a dictionary-based music genre classification system
  • Author
    Ping-Keng Jao ; Li Su ; Yi-Hsuan Yang
  • Author_Institution
    Res. Center for Inf. Technol. Innovation, Acad. Sinica, Taipei, Taiwan
  • fYear
    2013
  • fDate
    Oct. 29 2013-Nov. 1 2013
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Learning dictionaries from a large-scale music database is a burgeoning research topic in the music information retrieval (MIR) community. It has been shown that classification systems based on such learned features exhibit state-of-the-art accuracy in many music classification benchmarks. Although the general approach of dictionary-based MIR has been shown to be effective, little work has been done to investigate the relationship between system performance and dictionary properties, such as dictionary sparsity, coherence, and condition number. This paper aims to address this issue by systematically evaluating the performance of three types of dictionary learning algorithms for the task of genre classification: the least-squares-based RLS (recursive least squares) algorithm, and two variants of the stochastic gradient descent-based algorithm ODL (online dictionary learning) with different regularization functions. Specifically, we learn the dictionary with the USPOP2002 dataset and perform genre classification with the GTZAN dataset. Our results show that setting strict sparsity constraints in RLS-based dictionary learning (i.e., <1% of the signal dimension) leads to better accuracy in genre classification (around 80% when a linear-kernel support vector classifier is adopted). Moreover, we find that different sparsity constraints are needed for the dictionary learning phase and the encoding phase. Important links between dictionary properties and classification accuracy are also identified, such as a strong correlation between reconstruction error and classification accuracy in all algorithms. These findings can inform the design of future dictionary-based MIR systems and the selection of important dictionary learning parameters.
  • Keywords
    audio databases; gradient methods; information retrieval; learning (artificial intelligence); least squares approximations; music; pattern classification; stochastic processes; GTZAN dataset; MIR community; RLS-based dictionary learning algorithms; USPOP2002 dataset; classification accuracy; dictionary coherence; dictionary properties; dictionary sparsity; dictionary-based MIR systems; genre classification; large-scale music database; least-square based RLS algorithm; linear kernel support vector classifier; music information retrieval community; online dictionary learning; performance evaluation; reconstruction error; recursive least square algorithm; regularization functions; signal dimension; sparsity constraints; stochastic gradient descent-based algorithm ODL; Accuracy; Algorithm design and analysis; Classification algorithms; Dictionaries; Encoding; Kernel; Support vector machines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific
  • Conference_Location
    Kaohsiung
  • Type
    conf
  • DOI
    10.1109/APSIPA.2013.6694278
  • Filename
    6694278