DocumentCode :
22887
Title :
Deep Scattering Spectrum
Author :
Anden, J. ; Mallat, S.
Author_Institution :
Centre de Math. Appl., Ecole Polytech., Palaiseau, France
Volume :
62
Issue :
16
Year :
2014
Date :
Aug. 15, 2014
Firstpage :
4114
Lastpage :
4128
Abstract :
A scattering transform defines a locally translation invariant representation which is stable to time-warping deformations. It extends MFCC representations by computing modulation spectrum coefficients of multiple orders, through cascades of wavelet convolutions and modulus operators. Second-order scattering coefficients characterize transient phenomena such as attacks and amplitude modulation. A frequency transposition invariant representation is obtained by applying a scattering transform along log-frequency. State-of-the-art classification results are obtained for musical genre and phone classification on the GTZAN and TIMIT databases, respectively.
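The cascade described in the abstract (wavelet convolution, modulus, then averaging, repeated for a second order) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the Gaussian band-pass filters, the dyadic centre frequencies, and the names `gaussian_hat`, `filt`, `scattering`, `num_bands`, `top_freq`, and `lowpass_sigma` are all assumptions made for illustration.

```python
import numpy as np

def gaussian_hat(n, center, sigma):
    """Gaussian bump in the frequency domain (frequencies in cycles/sample)."""
    freqs = np.fft.fftfreq(n)
    return np.exp(-0.5 * ((freqs - center) / sigma) ** 2)

def filt(x, h_hat):
    """Circular convolution of x with the filter whose FFT is h_hat."""
    return np.fft.ifft(np.fft.fft(x) * h_hat)

def scattering(x, num_bands=6, top_freq=0.25, lowpass_sigma=0.002):
    """First- and second-order scattering-style coefficients (illustrative only)."""
    n = len(x)
    phi = gaussian_hat(n, 0.0, lowpass_sigma)            # averaging low-pass filter
    centers = [top_freq / 2 ** j for j in range(num_bands)]
    # Analytic band-pass filters: Gaussian stand-ins for wavelets (an assumption).
    psis = [gaussian_hat(n, xi, xi / 4) for xi in centers]
    s1, s2 = {}, {}
    for j1, psi1 in enumerate(psis):
        u1 = np.abs(filt(x, psi1))                       # wavelet modulus (envelope)
        s1[j1] = filt(u1, phi).real                      # first order: averaged envelope
        # j2 > j1 selects second filters centred at lower frequencies.
        for j2 in range(j1 + 1, num_bands):
            u2 = np.abs(filt(u1, psis[j2]))              # modulation of the envelope
            s2[(j1, j2)] = filt(u2, phi).real            # second order
    return s1, s2

# A pure tone and an amplitude-modulated tone with the same carrier.
t = np.arange(4096)
carrier = np.cos(2 * np.pi * 0.25 * t)
tone = carrier
am_tone = (1 + 0.5 * np.cos(2 * np.pi * t / 64)) * carrier

s1_tone, s2_tone = scattering(tone)
s1_am, s2_am = scattering(am_tone)
```

In this toy setting the averaged first-order coefficient in the carrier band is nearly the same for both signals, while the second-order coefficient at the modulation frequency, `s2_am[(0, 4)]`, is much larger for the modulated tone than for the pure one. That is consistent with the abstract's statement that second-order coefficients characterize amplitude modulation.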
Keywords :
acoustic wave scattering; amplitude modulation; audio signal processing; cepstral analysis; signal classification; signal representation; GTZAN database; MFCC; TIMIT database; audio classification; deep scattering spectrum; frequency transposition invariant representation; mel-frequency cepstral coefficients; modulus operators; musical genre; phone classification; scattering transform; second-order scattering coefficients; spectrum coefficients; time-warping deformation; transient phenomena; wavelet convolutions; Convolution; Frequency modulation; Scattering; Spectrogram; Wavelet analysis; Wavelet transforms; Audio classification; MFCC; deep neural networks; modulation spectrum; wavelets;
Language :
English
Journal_Title :
IEEE Transactions on Signal Processing
Publisher :
IEEE
ISSN :
1053-587X
Type :
Journal article
DOI :
10.1109/TSP.2014.2326991
Filename :
6822556