Regional Information Center for Science and Technology - Enforcing Harmonicity and Smoothness in Bayesian Non-Negative Matrix Factorization Applied to Polyphonic Music Transcription

DocumentCode :

1414253

Title :

Enforcing Harmonicity and Smoothness in Bayesian Non-Negative Matrix Factorization Applied to Polyphonic Music Transcription

Author :

Bertin, Nancy ; Badeau, Roland ; Vincent, Emmanuel

Author_Institution :

Dept. Traitement du Signal et des Images, TELECOM ParisTech, Paris, France

Volume :

Issue :

fYear :

2010

fDate :

3/1/2010

Firstpage :

538

Lastpage :

549

Abstract :

This paper presents theoretical and experimental results on constrained non-negative matrix factorization (NMF) in a Bayesian framework. A model of superimposed Gaussian components including harmonicity is proposed, while temporal continuity is enforced through an inverse-Gamma Markov chain prior. We then present a space-alternating generalized expectation-maximization (SAGE) algorithm to estimate the parameters. Computational time is reduced by initializing the system with an original variant of multiplicative harmonic NMF, which is described as well. The algorithm is then applied to polyphonic piano music transcription. It is compared to other state-of-the-art algorithms, especially NMF-based ones. Convergence issues are also discussed from both a theoretical and an experimental point of view. Bayesian NMF with harmonicity and temporal continuity constraints is shown to outperform other standard NMF-based transcription systems, providing a meaningful mid-level representation of the data. However, temporal smoothness has its drawbacks, particularly where transients are concerned, and can be detrimental to transcription performance when it is the only constraint used. Possible improvements of the temporal prior are discussed.

Keywords :

Markov processes; audio signal processing; expectation-maximisation algorithm; source separation; Bayesian nonnegative matrix factorization; audio source separation; harmonicity; inverse-Gamma Markov chain; polyphonic music transcription; smoothness; space-alternating generalized expectation-maximization algorithm; superimposed Gaussian components; unsupervised machine learning; Bayesian methods; Constraint theory; Convergence; Machine learning; Matrix decomposition; Parameter estimation; Signal processing; Streaming media; Telecommunications; Bayesian regression; music transcription; non-negative matrix factorization (NMF)

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2010.2041381

Filename :

5410052

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1414253