  • DocumentCode
    68787
  • Title
    Music Genre and Emotion Recognition Using Gaussian Processes
  • Author
    Markov, Konstantin; Matsui, Takashi
  • Author_Institution
    Div. of Inf. Syst., Univ. of Aizu, Aizu-Wakamatsu, Japan
  • Volume
    2
  • fYear
    2014
  • fDate
    2014
  • Firstpage
    688
  • Lastpage
    697
  • Abstract
    Gaussian Processes (GPs) are Bayesian nonparametric models that are becoming increasingly popular for their ability to capture highly nonlinear data relationships in various tasks, such as dimensionality reduction, time series analysis, and novelty detection, as well as classical regression and classification. In this paper, we investigate the feasibility and applicability of GP models for music genre classification and music emotion estimation, two of the main tasks in the music information retrieval (MIR) field. So far, the support vector machine (SVM) has been the dominant model used in MIR systems. Like SVMs, GP models are based on kernel functions and Gram matrices; in contrast, however, they produce truly probabilistic outputs with an explicit degree of prediction uncertainty. In addition, there exist algorithms for GP hyperparameter learning, something the SVM framework lacks. We built two systems, one for music genre classification and another for music emotion estimation, using both SVM and GP models, and compared their performance on two databases of similar size. In all cases, the music audio signal was processed in the same way, and the effects of different feature extraction methods and their various combinations were also investigated. The evaluation experiments clearly showed that in both tasks the GP performed consistently better than the SVM. The GP achieved a 13.6% relative genre classification error reduction and up to an 11% absolute increase of the coefficient of determination in the emotion estimation task.
  • Keywords
    Bayes methods; Gaussian processes; audio signal processing; emotion recognition; feature extraction; information retrieval; learning (artificial intelligence); music; nonparametric statistics; pattern classification; support vector machines; Bayesian nonparametric model; GP hyperparameter learning; Gaussian process; MIR system; SVM; degree of prediction uncertainty; feature extraction method; music audio signal processing; music emotion recognition; music genre classification error reduction; music genre recognition; music information retrieval; nonlinear data relationship; support vector machine; Acoustic measurement; Analytical models; Data models; Regression analysis; Time series analysis; Music genre classification; music emotion estimation
  • fLanguage
    English
  • Journal_Title
    IEEE Access
  • Publisher
    IEEE
  • ISSN
    2169-3536
  • Type
    jour
  • DOI
    10.1109/ACCESS.2014.2333095
  • Filename
    6843353