Title :
On the Information Geometry of Audio Streams With Applications to Similarity Computing
Author :
Cont, Arshia ; Dubnov, Shlomo ; Assayag, Gérard
Author_Institution :
Institute for Research and Coordination in Acoustics/Music (IRCAM), Paris, France
Date :
5/1/2011
Abstract :
This paper proposes a framework for processing audio streams using information geometry. We lay the theoretical groundwork for treating signal information as information entities suitable for similarity and symbolic computing on audio signals. The theory rests on the information geometry of statistical structures representing audio spectrum features, and specifically on the bijection between the generic family of Bregman divergences and that of exponential family distributions. The proposed framework, called Music Information Geometry, allows online segmentation of audio streams into metric balls, where each ball represents a quasi-stationary continuous chunk of audio, and provides methods to qualify and quantify information between entities for similarity computing. We define an information geometry that approximates a similarity metric space, redefine general notions in music information retrieval such as similarity between entities, and address methods for dealing with the nonstationarity of audio signals. We demonstrate the framework on two sample applications: online audio structure discovery and audio matching.
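The idea of segmenting a stream into divergence balls can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes each frame is a nonnegative spectrum with positive sum, normalized to the probability simplex, and it uses the Kullback-Leibler divergence, which is the Bregman divergence generated by negative entropy. The function names, the greedy ball-opening rule, the running-mean center, and the `radius` threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q): the Bregman divergence generated by negative entropy.

    Assumption: p and q are nonnegative vectors with positive sum;
    both are smoothed by eps and renormalized onto the simplex.
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))


def segment_into_balls(frames, radius):
    """Greedy online segmentation (illustrative, hypothetical rule).

    A frame stays in the current ball while its divergence to the ball
    center is at most `radius`; otherwise it opens a new ball.
    Returns the start index of every ball.
    """
    starts = []
    center = None
    count = 0
    for i, frame in enumerate(frames):
        frame = np.asarray(frame, dtype=float)
        frame = frame / frame.sum()
        if center is None or kl_divergence(frame, center) > radius:
            # Frame falls outside the current ball: open a new one.
            starts.append(i)
            center = frame.copy()
            count = 1
        else:
            # Update the center as the running arithmetic mean of the
            # frames assigned to this ball.
            count += 1
            center += (frame - center) / count
    return starts
```

Calling `segment_into_balls(frames, radius=0.1)` on a list of spectral frames returns the indices at which a new quasi-stationary chunk begins. A smaller `radius` yields more, shorter segments.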
Keywords :
audio signal processing; audio streaming; exponential distribution; geometry; information retrieval; music; statistical analysis; Bregman divergences; audio matching; audio signals; audio spectrum features; audio streams; exponential distributions; generic family; information entity; information processing; metric balls; music information geometry; music information retrieval; online audio structure discovery; online segmentation; quasi-stationary continuous chunk; signal information; similarity computing; similarity metric space; statistical structures; symbolic computing; Information geometry; music information retrieval (MIR)
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2010.2066266