Title :
Audio surveillance under noisy conditions using time-frequency image feature
Author :
Sharan, Roneel V. ; Moir, T.J.
Author_Institution :
Sch. of Eng., Auckland Univ. of Technol., Auckland, New Zealand
Abstract :
In this paper, we apply a novel method that uses features extracted from the time-frequency image representation of a sound signal to an audio surveillance application. In particular, we investigate two image representations: linear grayscale and log grayscale. We first divide a sound signal into smaller frames and apply a windowing function to each frame. The absolute value of the Discrete Fourier Transform of each frame is then computed and normalized to obtain the intensity values for the linear grayscale image. The log grayscale image is generated in a similar way, except that the log power of the values is taken before data normalization. Each image is then divided into blocks, and central moments are computed in each block. We carry out experiments under different noise conditions and at varying signal-to-noise ratios, using support vector machines for classification. Based on classification accuracy, the linear grayscale image approach is found to be more noise robust than the log grayscale image approach. It is also found to perform better than mel-frequency cepstral coefficients, a common baseline feature in sound recognition applications.
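The feature extraction steps described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `spectrogram_image` and `block_central_moments`, the Hamming window, the frame length and hop size, the min-max normalization, the 4x4 block grid, and the choice of second and third central moments are all assumptions, since the abstract does not specify them. The support vector machine classification stage is left out.

```python
import numpy as np


def spectrogram_image(signal, frame_len=256, hop=128, log_scale=False, eps=1e-10):
    """Build a grayscale time-frequency image with intensities in [0, 1].

    All parameter defaults are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hamming(frame_len)  # assumed window; the abstract names none
    n_frames = 1 + (len(signal) - frame_len) // hop

    # Split the signal into overlapping frames and window each one.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )

    # Absolute value of the DFT of each frame (non-negative frequencies only).
    magnitude = np.abs(np.fft.rfft(frames, axis=1))

    # Linear image: magnitudes as-is. Log image: log power before normalization.
    values = np.log(magnitude**2 + eps) if log_scale else magnitude

    # Min-max normalization to [0, 1] (assumed normalization scheme).
    lo, hi = values.min(), values.max()
    image = (values - lo) / (hi - lo) if hi > lo else np.zeros_like(values)

    return image.T  # rows = frequency bins, columns = frames


def block_central_moments(image, blocks=(4, 4), orders=(2, 3)):
    """Divide the image into a grid of blocks and compute central moments of
    the intensity values in each block (grid size and orders are assumed)."""
    features = []
    for row_band in np.array_split(image, blocks[0], axis=0):
        for block in np.array_split(row_band, blocks[1], axis=1):
            mean = block.mean()
            for k in orders:
                features.append(np.mean((block - mean) ** k))
    return np.array(features)


if __name__ == "__main__":
    # Synthetic one-second test tone with additive noise, sampled at 8 kHz.
    rng = np.random.default_rng(0)
    t = np.arange(8000) / 8000.0
    noisy_tone = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(t.size)

    linear_features = block_central_moments(spectrogram_image(noisy_tone))
    log_features = block_central_moments(
        spectrogram_image(noisy_tone, log_scale=True)
    )
    print(linear_features.shape, log_features.shape)
```

Under these assumptions, each image yields one fixed-length feature vector (blocks times moment orders), which could then be passed to a classifier such as a support vector machine.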
Keywords :
audio signal processing; cepstral analysis; discrete Fourier transforms; feature extraction; image classification; image representation; noise; support vector machines; surveillance; time-frequency analysis; audio surveillance application; data normalization; discrete Fourier transform; linear grayscale image; log grayscale image; mel-frequency cepstral coefficient; noisy condition; signal-to-noise ratio; sound recognition application; sound signal; support vector machine; time-frequency image feature; windowing function; Accuracy; Gray-scale; Spectrogram; Training; audio surveillance; central moments; linear grayscale; log grayscale; sound recognition; time-frequency image
Conference_Title :
Digital Signal Processing (DSP), 2014 19th International Conference on
Conference_Location :
Hong Kong
DOI :
10.1109/ICDSP.2014.6900815