DocumentCode
1297308
Title
A Nonparametric Bayesian Multipitch Analyzer Based on Infinite Latent Harmonic Allocation
Author
Yoshii, Kazuyoshi ; Goto, Masataka
Author_Institution
Nat. Inst. of Adv. Ind. Sci. & Technol. (AIST), Tsukuba, Japan
Volume
20
Issue
3
fYear
2012
fDate
3/1/2012 12:00:00 AM
Firstpage
717
Lastpage
730
Abstract
The statistical multipitch analyzer described in this paper estimates multiple fundamental frequencies (F0s) in polyphonic music audio signals produced by pitched instruments. It is based on hierarchical nonparametric Bayesian models that can deal with uncertainty of unknown random variables such as model complexities (e.g., the number of F0s and the number of harmonic partials), model parameters (e.g., the values of F0s and the relative weights of harmonic partials), and hyperparameters (i.e., prior knowledge on complexities and parameters). Using these models, we propose a statistical method called infinite latent harmonic allocation (iLHA). To avoid model-complexity control, we allow the observed spectra to contain an unbounded number of sound sources (F0s), each of which is allowed to contain an unbounded number of harmonic partials. More specifically, to model a set of time-sliced spectra, we formulated nested infinite Gaussian mixture models based on hierarchical and generalized Dirichlet processes. To avoid manual tuning of influential hyperparameters, we put noninformative hyperprior distributions on them in a hierarchical manner. For efficient Bayesian inference, we used a modern technique called collapsed variational Bayes. In comparative experiments using audio recordings of piano and guitar solo performances, iLHA yielded promising results and we found that there would be room for improvement based on modeling of temporal continuity and spectral smoothness.
Keywords
Bayes methods; Gaussian processes; audio signal processing; music; statistical analysis; Bayesian inference; collapsed variational Bayes; generalized Dirichlet process; hierarchical nonparametric Bayesian model; hyperparameters; infinite latent harmonic allocation; model complexities; model parameters; model-complexity control; multiple fundamental frequencies; nested infinite Gaussian mixture model; nonparametric Bayesian multipitch analyzer; pitched instruments; polyphonic music audio signals; random variables; sound sources; statistical multipitch analyzer; time-sliced spectra; Bayesian methods; Complexity theory; Harmonic analysis; Hidden Markov models; Probabilistic logic; Psychoacoustic models; Uncertainty; Bayesian nonparametrics; Dirichlet process; infinite latent harmonic allocation (iLHA); multipitch analysis;
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1558-7916
Type
jour
DOI
10.1109/TASL.2011.2164530
Filename
5983482