Regional Information Center for Science and Technology - Distortion discriminant analysis for audio fingerprinting

DocumentCode :

1226678

Title :

Distortion discriminant analysis for audio fingerprinting

Author :

Burges, Christopher J.C. ; Platt, John C. ; Jana, Soumya

Author_Institution :

Microsoft Res., Redmond, WA, USA

Volume :

Issue :

fYear :

2003

fDate :

5/1/2003

Firstpage :

165

Lastpage :

174

Abstract :

Mapping audio data to feature vectors for classification, retrieval, or identification tasks presents four principal challenges. The dimensionality of the input must be significantly reduced; the resulting features must be robust to likely distortions of the input; the features must be informative for the task at hand; and the feature extraction operation must be computationally efficient. We propose distortion discriminant analysis (DDA), which fulfills all four of these requirements. DDA constructs a linear, convolutional neural network out of layers, each of which performs an oriented PCA dimensional reduction. We demonstrate the effectiveness of DDA on two audio fingerprinting tasks: searching for 500 audio clips in 36 h of audio test data, and playing over 10 days of audio against a database with approximately 240,000 fingerprints. We show that the system is robust to kinds of noise that were not present during training. In the large test, the system gives a false positive rate of 1.5 × 10^-8 per audio clip, per fingerprint, at a false negative rate of 0.2% per clip.
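The abstract says each DDA layer performs an oriented PCA (OPCA) dimensional reduction. As a rough illustration only, the sketch below shows one common formulation of a single OPCA step: find the directions n that maximize the ratio of signal variance to distortion energy, n^T C n / n^T R n, by solving the generalized eigenproblem C n = λ R n. The function name `opca_directions`, the `ridge` regularizer, and the synthetic data are assumptions made for this example and are not taken from the record; the paper's actual preprocessing, layer sizes, and convolutional stacking are not reproduced here.

```python
import numpy as np


def opca_directions(clean, distorted, k, ridge=1e-6):
    """Return the top-k oriented-PCA directions and their generalized eigenvalues.

    clean, distorted: arrays of shape (m, d); row i of `distorted` is a
    distorted version of row i of `clean`. (Hypothetical helper for illustration.)
    """
    m, d = clean.shape
    x = clean - clean.mean(axis=0)           # centered signal vectors
    z = distorted - clean                    # distortion ("noise") vectors
    C = x.T @ x / m                          # signal covariance
    R = z.T @ z / m + ridge * np.eye(d)      # noise correlation, regularized

    # Reduce C n = lambda R n to a symmetric problem via R = L L^T.
    L = np.linalg.cholesky(R)
    A = np.linalg.solve(L, C)                # L^-1 C
    M = np.linalg.solve(L, A.T)              # L^-1 C L^-T
    M = (M + M.T) / 2.0                      # remove round-off asymmetry

    eigvals, V = np.linalg.eigh(M)           # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest
    N = np.linalg.solve(L.T, V[:, order])    # map back: n = L^-T v
    N /= np.linalg.norm(N, axis=0)           # unit-length columns
    return N, eigvals[order]


# Synthetic demo: axis 0 has the most signal variance but also heavy noise,
# axis 1 has strong signal and little noise, axis 2 carries little of either.
rng = np.random.default_rng(0)
clean = rng.normal(size=(2000, 3)) * np.array([3.0, 2.0, 0.1])
noise = rng.normal(size=(2000, 3)) * np.array([2.0, 0.1, 0.1])
distorted = clean + noise

N, vals = opca_directions(clean, distorted, k=2)
features = distorted @ N                     # reduced, distortion-robust features
```

In this toy setup plain PCA would favor axis 0 because it has the largest variance, whereas the OPCA criterion favors axis 1, where the signal-to-distortion ratio is highest. Stacking several such projections over successively wider time windows is one way to read the "layers" mentioned in the abstract, but that reading is an interpretation rather than something stated in this record.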

Keywords :

audio databases; audio signal processing; convolution; distortion; feature extraction; learning (artificial intelligence); neural nets; principal component analysis; signal classification; DDA; PCA dimensional reduction; audio classification; audio clips; audio data mapping; audio fingerprinting; audio identification; audio retrieval; audio test data; database; distortion discriminant analysis; false negative rate; false positive rate; feature extraction; feature vectors; input dimensionality reduction; linear convolutional neural network; training; Audio databases; Convolution; Feature extraction; Fingerprint recognition; Information retrieval; Neural networks; Noise robustness; Nonlinear distortion; Streaming media; Testing;

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/TSA.2003.811538

Filename :

1208286

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1226678