• DocumentCode
    417219
  • Title

    Speech discrimination based on multiscale spectro-temporal modulations

  • Author

    Mesgarani, Nima ; Shamma, Shihab ; Slaney, Malcolm

  • Author_Institution
    Neural Syst. Laboratory, Maryland Univ., College Park, MD, USA
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    A novel approach for content-based audio classification is presented, based on multiscale spectro-temporal modulation features extracted using a model of auditory cortex. The task is to discriminate speech from non-speech, which consists of animal vocalizations, music and environmental sounds. Generalization of the system to signals with high levels of additive noise and reverberation is evaluated and compared to two existing approaches. The results demonstrate the advantages of the auditory model over the other two systems, especially at low SNR and high reverberation.
  • Keywords
    feature extraction; reverberation; signal classification; signal resolution; spectral analysis; speech recognition; additive noise; animal vocalizations; auditory cortex model; content based audio classification; environmental sounds; multiscale spectro-temporal modulations; music; speech discrimination; Auditory system; Biomembranes; Brain modeling; Filter bank; Hair; Psychoacoustic models; Spectrogram; Speech
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.2004.1326057
  • Filename
    1326057