Title :
Semantic rank reduction of music audio
Author_Institution :
Media Lab., MIT, Cambridge, MA, USA
Abstract :
Audio understanding and classification tasks are often aided by a reduced-dimensionality representation of the source observations. For example, a supervised learning system trained to detect the genre or artist of a piece of music performs better if the input nodes are statistically decorrelated, either to prevent overfitting in the learning process or to 'anchor' similar observations to cluster centroids in the observation space. We provide an alternative approach that decomposes audio observations of music into semantically significant dimensions where each resultant dimension corresponds to the perceived meaning of the audio, and only the most significant meanings (those which are most effective in describing music audio) are kept. We show a fundamentally unsupervised method to obtain this decomposition automatically and compare its performance in a music understanding task against statistical decorrelation approaches such as PCA and non-negative matrix factorization (NMF).
Keywords :
audio signal processing; decorrelation; learning (artificial intelligence); music; pattern classification; principal component analysis; signal classification; PCA; audio classification; audio decomposition; audio understanding; cluster centroids; confusion matrices; learning process; music audio; music understanding; nonnegative matrix factorization; semantic rank reduction; statistical decorrelation; supervised learning system; Acoustic noise; Cultural differences; Internet; Joining processes; Matrix decomposition; Multiple signal classification; Music information retrieval; Noise figure; Principal component analysis; Supervised learning;
Conference_Title :
Applications of Signal Processing to Audio and Acoustics, 2003 IEEE Workshop on.
Print_ISBN :
0-7803-7850-4
DOI :
10.1109/ASPAA.2003.1285838