Title :
Bayesian nonparametric music parser
Author :
Nakano, Masahiro ; Ohishi, Yasunori ; Kameoka, Hirokazu ; Mukai, Ryo ; Kashino, Kunio
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Atsugi, Japan
Abstract :
This paper proposes a novel representation of music that can be used for similarity-based music information retrieval, and presents a method that converts an input polyphonic audio signal into the proposed representation. The representation is a 2-dimensional tree structure in which each node encodes a musical note and the two dimensions correspond to time and to simultaneously sounding notes, respectively. Since the temporal structure and the synchrony of simultaneous events are both essential in music, our representation reflects them explicitly. In conventional approaches to music representation from audio, note extraction is usually performed prior to structure analysis, but accurate note extraction has been a difficult task. In the proposed method, note extraction and structure estimation are performed simultaneously, and thus the optimal solution is obtained with a unified inference procedure. Specifically, we propose an extended 2-dimensional infinite probabilistic context-free grammar and a sparse factor model for spectrogram analysis. An efficient inference algorithm, based on Markov chain Monte Carlo sampling and dynamic programming, is presented. The experimental results show the effectiveness of the proposed approach.
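The dynamic-programming component mentioned in the abstract can be illustrated with the standard inside algorithm for an ordinary probabilistic context-free grammar. The sketch below is only an illustration under simplifying assumptions: it handles a finite, 1-dimensional grammar in Chomsky normal form, not the extended 2-dimensional infinite PCFG of the paper, and the function name `inside_probability`, the rule dictionaries, and the toy note grammar are all hypothetical.

```python
from collections import defaultdict


def inside_probability(tokens, lexical, binary, start="S"):
    """Return P(tokens | grammar) for a PCFG in Chomsky normal form.

    lexical: {(A, token): prob} for rules A -> token
    binary:  {(A, B, C): prob} for rules A -> B C
    """
    n = len(tokens)
    # chart[i][j][A] = probability that nonterminal A derives tokens[i:j]
    chart = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]

    # Spans of length 1: apply the lexical rules.
    for i, tok in enumerate(tokens):
        for (a, word), p in lexical.items():
            if word == tok:
                chart[i][i + 1][a] += p

    # Longer spans: sum over every split point and every binary rule.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    left = chart[i][k].get(b, 0.0)
                    right = chart[k][j].get(c, 0.0)
                    if left and right:
                        chart[i][j][a] += p * left * right

    return chart[0][n].get(start, 0.0)


# Hypothetical toy grammar: a phrase S is one note, or a note N followed by a phrase.
lexical = {("S", "C4"): 0.2, ("S", "E4"): 0.2, ("N", "C4"): 0.5, ("N", "E4"): 0.5}
binary = {("S", "N", "S"): 0.6}

prob = inside_probability(["C4", "E4", "C4"], lexical, binary)
print(prob)  # approximately 0.018 (0.6 * 0.5 * 0.6 * 0.5 * 0.2)
```

The chart is filled from short spans to long ones, so each cell is computed once from already-finished cells; this reuse is what makes the computation polynomial rather than exponential in the sequence length.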
Keywords :
Bayes methods; Markov processes; Monte Carlo methods; audio signal processing; dynamic programming; electronic music; information retrieval; 2-dimensional infinite probabilistic context-free grammar; 2-dimensional tree structure; Bayesian nonparametric music parser; Markov chain Monte Carlo sampling; inference algorithm; music representation; musical note encoding; note extraction; polyphonic audio signal; similarity-based music information retrieval; sparse factor; spectrogram analysis; structure estimation; temporal structure; unified inference procedure; Abstracts; Bayesian methods; Biological system modeling; Continuous wavelet transforms; Indexes; Markov chain Monte Carlo (MCMC); hierarchical Dirichlet process (HDP); infinite probabilistic context-free grammar (infinite PCFG); nonnegative matrix factorization (NMF)
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2012.6287916