Active-Set Newton Algorithm for Overcomplete Non-Negative Representations of Audio

Author

Virtanen, Tuomas ; Gemmeke, Jort F. ; Raj, Bhiksha

Author_Institution

Dept. of Signal Process., Tampere Univ. of Technol., Tampere, Finland

Volume

21

Issue

11

fYear

2013

fDate

Nov. 2013

Firstpage

2277

Lastpage

2289

Abstract

This paper proposes a computationally efficient algorithm for estimating the non-negative weights of linear combinations of the atoms of large-scale audio dictionaries, so that the generalized Kullback-Leibler divergence between an audio observation and the model is minimized. This linear model has been found useful in many audio signal processing tasks, but the existing algorithms are computationally slow when a large number of atoms is used. The proposed algorithm is based on iteratively updating a set of active atoms, with the weights updated using the Newton method and the step size estimated such that the weights remain non-negative. Algorithm convergence evaluations on representing audio spectra that are mixtures of two speakers show that with all the tested dictionary sizes the proposed method reaches a much lower value of the divergence than can be obtained by conventional algorithms, and is up to 8 times faster. A source separation evaluation revealed that when using large dictionaries, the proposed method produces a better separation quality in less time.
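The update scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `gkl` and `active_set_newton_sketch`, the `ridge` and `eps` constants, the single-atom initialization, the rule for adding and dropping atoms, and the backtracking safeguard are all assumptions made here for illustration. The sketch takes an observation vector `x` and a non-negative dictionary matrix `B` (one atom per column), and minimizes the generalized Kullback-Leibler divergence sum(x log(x/y) - x + y) with y = B w.

```python
import numpy as np

def gkl(x, y, eps=1e-12):
    """Generalized KL divergence: sum(x * log(x / y) - x + y)."""
    return float(np.sum(x * np.log((x + eps) / (y + eps)) - x + y))

def active_set_newton_sketch(x, B, n_iter=30, ridge=1e-10, eps=1e-12):
    """Estimate non-negative weights w so that B @ w approximates x (illustrative only)."""
    x = np.asarray(x, dtype=float)
    B = np.asarray(B, dtype=float)
    n_atoms = B.shape[1]

    # Assumed initialization: the single atom that, once scaled, fits x best.
    scales = x.sum() / (B.sum(axis=0) + eps)
    start = int(np.argmin([gkl(x, B[:, n] * scales[n]) for n in range(n_atoms)]))
    w = np.zeros(n_atoms)
    w[start] = scales[start]
    active = [start]

    for _ in range(n_iter):
        y = B @ w + eps
        grad = B.T @ (1.0 - x / y)  # gradient of the divergence w.r.t. all weights

        # Grow the active set with the inactive atom of most negative gradient.
        inactive = np.setdiff1d(np.arange(n_atoms), active)
        if inactive.size:
            best = int(inactive[np.argmin(grad[inactive])])
            if grad[best] < 0.0:
                active.append(best)
        if not active:
            break

        # Newton direction restricted to the active atoms (small ridge for stability).
        A = B[:, active]
        H = A.T @ (A * (x / y**2)[:, None]) + ridge * np.eye(len(active))
        d = np.linalg.solve(H, grad[active])

        # Largest step in [0, 1] that keeps every active weight non-negative.
        w_a = w[active]
        pos = d > 0.0
        alpha = min(1.0, float(np.min(w_a[pos] / d[pos]))) if pos.any() else 1.0

        # Backtracking safeguard (an addition of this sketch): never accept an increase.
        current = gkl(x, B @ w)
        accepted = False
        while alpha > 1e-8:
            w_try = np.maximum(w_a - alpha * d, 0.0)
            if gkl(x, A @ w_try) <= current:
                accepted = True
                break
            alpha *= 0.5
        if accepted:
            w[active] = w_try

        # Drop atoms whose weight has been driven to zero.
        active = [n for n in active if w[n] > 0.0]

    return w
```

Each Newton step solves a linear system whose size equals the number of active atoms rather than the full dictionary size, which is where a saving over updating all weights at once would come from in this sketch.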

Keywords

Newton method; audio signal processing; signal representation; active atoms; active-set Newton algorithm; algorithm convergence evaluations; audio observation; audio signal processing tasks; audio spectra; computationally efficient algorithm; generalized Kullback-Leibler divergence; large-scale audio dictionaries; linear combinations; linear model; overcomplete nonnegative audio representations; separation quality; source separation evaluation; tested dictionary sizes; Acoustic signal analysis; Large scale systems; Optimization; Pattern recognition; Source separation; Newton algorithm; audio source separation; convex optimization; non-negative matrix factorization; sparse coding; sparse representation; supervised source separation

fLanguage

English

Journal_Title

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1558-7916

Type

jour

DOI

10.1109/TASL.2013.2263144

Filename

6516060