DocumentCode :
825220
Title :
Gaussian Mixture Modeling of Short-Time Fourier Transform Features for Audio Fingerprinting
Author :
Ramalingam, A. ; Krishnan, S.
Author_Institution :
Dept. of Electr. & Comput. Eng., Ryerson Inst., Toronto, Ont.
Volume :
1
Issue :
4
fYear :
2006
Firstpage :
457
Lastpage :
463
Abstract :
In audio fingerprinting, an audio clip must be recognized by matching an extracted fingerprint to a database of previously computed fingerprints. The fingerprints should reduce the dimensionality of the input significantly, provide discrimination among different audio clips, and, at the same time, be invariant to distorted versions of the same audio clip. In this paper, we design fingerprints addressing the above issues by modeling an audio clip by Gaussian mixture models (GMM). We evaluate the performance of many easy-to-compute short-time Fourier transform features, such as Shannon entropy, Renyi entropy, spectral centroid, spectral bandwidth, spectral flatness measure, spectral crest factor, and Mel-frequency cepstral coefficients in modeling audio clips using GMM for fingerprinting. We test the robustness of the fingerprints under a large number of distortions. To make the system robust, we use some of the distorted versions of the audio for training. However, we show that the audio fingerprints modeled using GMM are not only robust to the distortions used in training but also to distortions not used in training. Among the features tested, spectral centroid performs best with an identification rate of 99.2% at a false positive rate of 10^-4. All of the features give an identification rate of more than 90% at a false positive rate of 10^-3.
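Illustrative note (not part of the original record): two of the features named in the abstract, spectral centroid and spectral flatness measure, have standard textbook definitions that can be sketched for a single frame as below. This is only a minimal sketch under stated assumptions. The abstract does not give the paper's frame length, window, sample rate, or GMM configuration, so the function names, the unwindowed naive DFT, and the `eps` guard are all hypothetical choices made for illustration.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency DFT bins of one frame.

    Naive O(n^2) DFT with no window applied (an assumption; the abstract
    does not state how the paper frames or windows the signal).
    """
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(mags, sample_rate, frame_len):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    total = sum(mags)
    if total == 0:
        return 0.0
    bin_hz = sample_rate / frame_len  # frequency spacing between bins
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total


def spectral_flatness(mags, eps=1e-12):
    """Geometric mean divided by arithmetic mean of the power spectrum.

    Values near 1 indicate a flat, noise-like spectrum; values near 0
    indicate a peaky, tone-like spectrum. eps (hypothetical) avoids log(0).
    """
    power = [m * m + eps for m in mags]
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / arithmetic


if __name__ == "__main__":
    sample_rate, frame_len = 6400, 64  # illustrative values only
    tone = [math.sin(2 * math.pi * 800 * t / sample_rate)
            for t in range(frame_len)]
    mags = magnitude_spectrum(tone)
    print(spectral_centroid(mags, sample_rate, frame_len))
    print(spectral_flatness(mags))
```

In a pipeline of the kind the abstract describes, per-frame values like these would form the feature sequence to which a Gaussian mixture model is fitted for each clip; that fitting step is not shown here because the abstract gives no details of it.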
Keywords :
Fourier transforms; Gaussian processes; audio signal processing; cepstral analysis; security of data; Gaussian mixture modeling; Mel-frequency cepstral coefficient; Renyi entropy; Shannon entropy; audio clip; audio fingerprinting; fingerprint extraction; short-time Fourier transform feature; spectral bandwidth; spectral centroid; spectral crest factor; spectral flatness measure; Audio databases; Bandwidth; Cepstral analysis; Distortion measurement; Entropy; Fingerprint recognition; Fourier transforms; Robustness; Spatial databases; Testing; Audio fingerprinting; Gaussian mixture models; automatic song identification;
fLanguage :
English
Journal_Title :
Information Forensics and Security, IEEE Transactions on
Publisher :
IEEE
ISSN :
1556-6013
Type :
jour
DOI :
10.1109/TIFS.2006.885036
Filename :
4014108