Author Institution :
Div. of Inf. Syst., Univ. of Aizu, Aizu-Wakamatsu, Japan
Abstract :
Gaussian processes (GPs) are Bayesian nonparametric models that are becoming increasingly popular for their ability to capture highly nonlinear data relationships in tasks such as dimensionality reduction, time series analysis, and novelty detection, as well as classical regression and classification. In this paper, we investigate the feasibility and applicability of GP models for music genre classification and music emotion estimation, two of the main tasks in the music information retrieval (MIR) field. So far, the support vector machine (SVM) has been the dominant model used in MIR systems. Like SVMs, GP models are based on kernel functions and Gram matrices; in contrast, however, they produce truly probabilistic outputs with an explicit degree of prediction uncertainty. In addition, there exist algorithms for GP hyperparameter learning, something the SVM framework lacks. We built two systems, one for music genre classification and another for music emotion estimation, each using both SVM and GP models, and compared their performance on two databases of similar size. In all cases, the music audio signal was processed in the same way, and the effects of different feature extraction methods and their various combinations were also investigated. The evaluation experiments clearly showed that, in both tasks, the GP performed consistently better than the SVM: it achieved a 13.6% relative reduction in genre classification error and up to an 11% absolute increase in the coefficient of determination in the emotion estimation task.
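The abstract's point that a GP is built on a kernel Gram matrix yet returns a prediction together with an explicit uncertainty can be illustrated with a minimal sketch of GP regression. Everything specific here is an assumption made for illustration and is not taken from the paper: the squared-exponential kernel, its fixed hyperparameters, the toy one-dimensional data, and the helper names `rbf_kernel`, `solve`, and `gp_predict`. The paper's actual features, kernels, and hyperparameter learning are not reproduced.

```python
import math

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(x_train, y_train, x_test, noise_var=1e-2):
    """Return the GP posterior mean and latent variance at one test input."""
    n = len(x_train)
    # Gram matrix of the training inputs, with observation noise on the diagonal.
    K = [[rbf_kernel(x_train[i], x_train[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # Kernel values between each training input and the test input.
    k_star = [rbf_kernel(x, x_test) for x in x_train]
    alpha = solve(K, y_train)  # K^{-1} y
    v = solve(K, k_star)       # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf_kernel(x_test, x_test) - sum(k * w for k, w in zip(k_star, v))
    return mean, var

# Toy data (hypothetical): noiseless samples of sin(x) at five points.
x_train = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_train = [math.sin(x) for x in x_train]

mean_near, var_near = gp_predict(x_train, y_train, 0.5)  # inside the data range
mean_far, var_far = gp_predict(x_train, y_train, 6.0)    # far from any data
print(mean_near, var_near)
print(mean_far, var_far)
```

In this sketch the predictive variance is small near the training inputs and returns toward the prior variance far from them, which is the kind of uncertainty estimate a standard SVM does not provide directly.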
Keywords :
Bayes methods; Gaussian processes; audio signal processing; emotion recognition; feature extraction; information retrieval; learning (artificial intelligence); music; nonparametric statistics; pattern classification; support vector machines; Bayesian nonparametric model; GP hyperparameter learning; Gaussian process; MIR system; SVM; degree of prediction uncertainty; feature extraction method; music audio signal processing; music emotion recognition; music genre classification error reduction; music genre recognition; music information retrieval; nonlinear data relationship; support vector machine; Acoustic measurement; Analytical models; Data models; Regression analysis; Time series analysis; Music genre classification; music emotion estimation