Title :
Active-Set Newton Algorithm for Overcomplete Non-Negative Representations of Audio
Author :
Virtanen, Tuomas ; Gemmeke, Jort F. ; Raj, Bhiksha
Author_Institution :
Department of Signal Processing, Tampere University of Technology, Tampere, Finland
Abstract :
This paper proposes a computationally efficient algorithm for estimating the non-negative weights of linear combinations of the atoms of large-scale audio dictionaries, so that the generalized Kullback-Leibler divergence between an audio observation and the model is minimized. This linear model has been found useful in many audio signal processing tasks, but existing algorithms are slow when a large number of atoms is used. The proposed algorithm iteratively updates a set of active atoms; the weights are updated using the Newton method, and the step size is chosen so that the weights remain non-negative. Convergence evaluations on audio spectra that are mixtures of two speakers show that, for all tested dictionary sizes, the proposed method reaches a much lower value of the divergence than conventional algorithms and is up to 8 times faster. A source separation evaluation showed that, with large dictionaries, the proposed method produces better separation quality in less time.
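The update scheme summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`gkl`, `active_set_newton`), the best-single-atom initialization, the rule of adding one atom per iteration by most negative gradient, the small Hessian regularization, and the backtracking safeguard are all assumptions made here for illustration. Only the overall structure (an active set of atoms, Newton updates of their weights, and a step size capped so that the weights stay non-negative) comes from the abstract.

```python
import numpy as np


def gkl(x, y):
    """Generalized Kullback-Leibler divergence D(x || y), y strictly positive."""
    pos = x > 0
    return float(np.sum(x[pos] * np.log(x[pos] / y[pos])) - x.sum() + y.sum())


def active_set_newton(x, B, n_iter=200, eps=1e-12):
    """Illustrative sketch: estimate w >= 0 so that B @ w approximates x.

    x: non-negative observation, shape (n_features,)
    B: non-negative dictionary, shape (n_features, n_atoms), one atom per column
    Returns the weights and the divergence after each iteration.
    """
    n_atoms = B.shape[1]
    # Assumed initialization: the single atom that fits best after scaling.
    # For this divergence the best scale of an atom b is sum(x) / sum(b).
    scales = x.sum() / B.sum(axis=0)
    k0 = int(np.argmin([gkl(x, s * B[:, k] + eps) for k, s in enumerate(scales)]))
    w = np.zeros(n_atoms)
    w[k0] = scales[k0]
    active = [k0]
    history = [gkl(x, B @ w + eps)]

    for _ in range(n_iter):
        y = B @ w + eps
        grad = B.T @ (1.0 - x / y)  # gradient of the divergence w.r.t. w
        # Assumed growth rule: add the inactive atom with the most negative gradient.
        inactive = np.setdiff1d(np.arange(n_atoms), active)
        if inactive.size:
            k = int(inactive[np.argmin(grad[inactive])])
            if grad[k] < -1e-9:
                active.append(k)

        while True:
            A = np.array(active)
            BA = B[:, A]
            # Hessian restricted to the active atoms, lightly regularized.
            H = BA.T @ (BA * (x / y**2)[:, None]) + 1e-10 * np.eye(A.size)
            d = -np.linalg.solve(H, grad[A])  # Newton direction
            # Atoms at zero weight that the step would push negative are dropped.
            blocked = (w[A] <= 0) & (d < 0)
            if not blocked.any():
                break
            active = [k for k, b in zip(active, blocked) if not b]

        # Largest step in (0, 1] that keeps every active weight non-negative.
        alpha = 1.0
        shrinking = d < 0
        if shrinking.any():
            alpha = min(1.0, float(np.min(-w[A][shrinking] / d[shrinking])))

        # Assumed safeguard: halve the step until the divergence decreases.
        cur = history[-1]
        for _ in range(30):
            w_new = w.copy()
            w_new[A] = np.maximum(w[A] + alpha * d, 0.0)
            new = gkl(x, B @ w_new + eps)
            if new < cur:
                w, cur = w_new, new
                break
            alpha *= 0.5
        history.append(cur)
        # Atoms whose weight reached zero leave the active set.
        active = [k for k in active if w[k] > 0]

    return w, history
```

Because only the active atoms enter the Hessian, each Newton step solves a linear system whose size is the number of active atoms rather than the full dictionary size, which is the property that makes this kind of scheme attractive for large dictionaries.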
Keywords :
Newton method; audio signal processing; signal representation; active atoms; active-set Newton algorithm; algorithm convergence evaluations; audio observation; audio signal processing tasks; audio spectra; computationally efficient algorithm; generalized Kullback-Leibler divergence; large-scale audio dictionaries; linear combinations; linear model; overcomplete nonnegative audio representations; separation quality; source separation evaluation; tested dictionary sizes; Acoustic signal analysis; Large scale systems; Optimization; Pattern recognition; Source separation; Newton algorithm; audio source separation; convex optimization; non-negative matrix factorization; sparse coding; sparse representation; supervised source separation
Journal_Title :
IEEE Transactions on Audio, Speech, and Language Processing
DOI :
10.1109/TASL.2013.2263144