• DocumentCode
    239732
  • Title
    Audio surveillance under noisy conditions using time-frequency image feature
  • Author
    Sharan, Roneel V.; Moir, T.J.

  • Author_Institution
    Sch. of Eng., Auckland Univ. of Technol., Auckland, New Zealand
  • fYear
    2014
  • fDate
    20-23 Aug. 2014
  • Firstpage
    130
  • Lastpage
    135
  • Abstract
    In this paper, we apply a novel method, using features extracted from the time-frequency image representation of a sound signal, to an audio surveillance application. In particular, we investigate two image representations: linear grayscale and log grayscale. We first divide a sound signal into smaller frames and apply a windowing function. The absolute value of the Discrete Fourier Transform of each frame is then computed and normalized to get the intensity values for the linear grayscale image. The log grayscale image is generated similarly, but we take the log power of the values before data normalization. Each image is then divided into blocks, and central moments are computed in each block. We carry out experiments under different noise conditions and varying signal-to-noise ratios, using support vector machines for classification. Based on classification accuracy, the linear grayscale image approach is found to be more noise robust than the log grayscale image approach. It was also found to perform better than mel-frequency cepstral coefficients, a common baseline feature in most sound recognition applications.
  • Keywords
    audio signal processing; cepstral analysis; discrete Fourier transforms; feature extraction; image classification; image representation; noise; support vector machines; surveillance; time-frequency analysis; audio surveillance application; data normalization; discrete Fourier transform; linear grayscale image; log grayscale image; mel-frequency cepstral coefficient; noisy condition; signal-to-noise ratio; sound recognition application; sound signal; support vector machine; time-frequency image feature; windowing function; Accuracy; Gray-scale; Spectrogram; Training; audio surveillance; central moments; linear grayscale; log grayscale; sound recognition; time-frequency image
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Signal Processing (DSP), 2014 19th International Conference on
  • Conference_Location
    Hong Kong
  • Type
    conf
  • DOI
    10.1109/ICDSP.2014.6900815
  • Filename
    6900815
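
The feature extraction steps described in the abstract (framing, windowing, DFT magnitude, optional log power, normalization, block-wise central moments) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the Hamming window, the frame and hop lengths, the min-max normalization, the 4 x 4 block grid, and the moment orders (2 and 3) are all assumptions, since the abstract does not specify them.

```python
import numpy as np


def spectrogram_image(signal, frame_len=256, hop=128, log_scale=False):
    """Return a grayscale time-frequency image with values in [0, 1].

    Rows are frequency bins, columns are frames. All defaults are assumed.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)  # assumed window; the abstract names none
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Absolute value of the DFT of each frame (non-negative frequencies only).
    mag = np.abs(np.fft.rfft(frames, axis=1)).T
    if log_scale:
        # Log power before normalization; the small offset avoids log(0).
        mag = np.log(mag ** 2 + 1e-10)
    # Min-max normalization to [0, 1] (assumed normalization scheme).
    span = mag.max() - mag.min()
    return (mag - mag.min()) / span if span > 0 else np.zeros_like(mag)


def block_central_moments(image, n_rows=4, n_cols=4, orders=(2, 3)):
    """Split the image into n_rows x n_cols blocks and return, for each
    block, the central moments of its intensities for the given orders."""
    features = []
    for band in np.array_split(image, n_rows, axis=0):
        for block in np.array_split(band, n_cols, axis=1):
            mean = block.mean()
            features.extend(((block - mean) ** k).mean() for k in orders)
    return np.array(features)


# Hypothetical input: one second of a noisy 440 Hz tone sampled at 8 kHz.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
tone = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(t.size)

linear_img = spectrogram_image(tone)               # linear grayscale image
log_img = spectrogram_image(tone, log_scale=True)  # log grayscale image
features = block_central_moments(linear_img)       # one vector per sound
```

Feature vectors of this kind, one per sound clip, would then be passed to a support vector machine classifier (for example scikit-learn's SVC, which is an assumption here; the record does not name a toolkit).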