• DocumentCode
    740621
  • Title

    An Overview on Perceptually Motivated Audio Indexing and Classification

  • Author

    Richard, Gaël ; Sundaram, Shiva ; Narayanan, Shrikanth

  • Author_Institution
    Telecom ParisTech, Paris, France
  • Volume
    101
  • Issue
    9
  • fYear
    2013
  • Firstpage
    1939
  • Lastpage
    1954
  • Abstract
    An audio indexing system aims at describing audio content by identifying, labeling, or categorizing different acoustic events. Since the resulting audio classification and indexing is meant for direct human consumption, it is highly desirable that it produces perceptually relevant results. This can be obtained by integrating specific knowledge of the human auditory system in the design process to various extents. In this paper, we highlight some of the important concepts used in audio classification and indexing that are perceptually motivated or that exploit some principles of perception. In particular, we discuss several different strategies to integrate human perception, including: 1) the use of generic audition models; 2) the use of perceptually relevant features for the analysis stage that are perceptually justified either as a component of a hearing model or as being correlated with a perceptual dimension of sound similarity; and 3) the involvement of the user in the audio indexing or classification task. In this paper, we also illustrate some of the recent trends in semantic audio retrieval that approximate higher level perceptual processing and cognitive aspects of human audio recognition capabilities, including affect-based audio retrieval.
  • Keywords
    audio signal processing; cognition; indexing; information retrieval; pattern classification; acoustic events; affect-based audio retrieval; approximate higher level perceptual processing; cognitive aspects; direct human consumption; generic audition models; human audio recognition capabilities; human auditory system; perceptually motivated audio classification; perceptually motivated audio indexing; perceptually relevant features; Acoustics; Auditory system; Filter banks; Frequency modulation; Indexing; Labeling; Time-frequency analysis; Affect-based audio retrieval; audio classification; audio indexing; music indexing; music information retrieval; musical timbre recognition; perceptual audio features; perceptual signal representations; semantic audio retrieval;
  • fLanguage
    English
  • Journal_Title
    Proceedings of the IEEE
  • Publisher
    IEEE
  • ISSN
    0018-9219
  • Type

    jour

  • DOI
    10.1109/JPROC.2013.2251591
  • Filename
    6560388