• DocumentCode
    5966
  • Title
    Robust Sound Event Classification Using Deep Neural Networks
  • Author
    McLoughlin, Ian ; Haomin Zhang ; Zhipeng Xie ; Yan Song ; Wei Xiao
  • Author_Institution
    Nat. Eng. Lab. of Speech & Language Inf. Process., Univ. of Sci. & Technol. of China, Hefei, China
  • Volume
    23
  • Issue
    3
  • fYear
    2015
  • fDate
    Mar-15
  • Firstpage
    540
  • Lastpage
    552
  • Abstract
    The automatic recognition of sound events by computers is an important aspect of emerging applications such as automated surveillance, machine hearing and auditory scene understanding. Recent advances in machine learning, as well as in computational models of the human auditory system, have contributed to advances in this increasingly popular research field. Robust sound event classification, the ability to recognise sounds under real-world noisy conditions, is an especially challenging task. Classification methods translated from the speech recognition domain, using features such as mel-frequency cepstral coefficients, have been shown to perform reasonably well for the sound event classification task, although spectrogram-based or auditory image analysis techniques reportedly achieve superior performance in noise. This paper outlines a sound event classification framework that compares auditory image front end features with spectrogram image-based front end features, using support vector machine and deep neural network classifiers. Performance is evaluated on a standard robust classification task in different levels of corrupting noise, and with several system enhancements, and shown to compare very well with current state-of-the-art classification techniques.
  • Keywords
    acoustic signal processing; feature extraction; neural nets; signal classification; support vector machines; DNN; auditory image front end feature; deep neural network; sound event classification; spectrogram image-based front end feature; support vector machine; Auditory system; Feature extraction; Spectrogram; Speech; Speech processing; Support vector machines; Vectors; Auditory event detection; machine hearing;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2015.2389618
  • Filename
    7003973