• DocumentCode
    3275697
  • Title

    An attribute-based approach to audio description applied to segmenting vocal sections in popular music songs

  • Author

    Sundaram, Shiva ; Narayanan, Shrikanth

  • Author_Institution
    Dept. of Electr. Eng.-Syst., Univ. of Southern California, Los Angeles, CA
  • fYear
    2006
  • fDate
    3-6 Oct. 2006
  • Firstpage
    103
  • Lastpage
    107
  • Abstract
    We present a descriptive approach for analyzing audio scenes that can comprise a mixture of audio sources. We apply this method to segment popular music songs into vocal and non-vocal sections. Unlike existing methods that rely directly on within-class feature similarities of acoustic sources, the proposed data-driven system is based on a training set in which the acoustic sources are grouped by their perceptual or semantic attributes. Our audio analysis approach is based on a quantitative time-varying metric, developed using pattern recognition methods, that measures the interaction between the acoustic sources present in a scene. Using the proposed system, trained on a general sound effects library, we achieve less than ten percent vocal-section segmentation error and less than five percent false alarm rates when evaluated on a database of popular music recordings that spans four genres (rock, hip-hop, pop, and easy listening).
  • Keywords
    audio databases; audio signal processing; musical acoustics; pattern recognition; time-varying systems; audio sources; data-driven system; music recordings database; music songs; pattern recognition methods; time-varying metric; vocal sections segmentation; Acoustic measurements; Audio recording; Databases; Laboratories; Layout; Libraries; Music information retrieval; Pattern analysis; Pattern recognition; Speech analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia Signal Processing, 2006 IEEE 8th Workshop on
  • Conference_Location
    Victoria, BC
  • Print_ISBN
    0-7803-9751-7
  • Electronic_ISBN
    0-7803-9752-5
  • Type

    conf

  • DOI
    10.1109/MMSP.2006.285277
  • Filename
    4064527