  • DocumentCode
    3349485
  • Title
    Content-based retrieval of MP3 songs based on query by singing
  • Author
    Lie, Wen-Nung ; Su, Chen-Kang
  • Author_Institution
    Dept. of Electr. Eng., Nat. Chung Cheng Univ., Taiwan
  • Volume
    5
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    With the growth of multimedia on the Internet, content analysis of multimedia plays an important role in humanistic management. We investigate the content-based retrieval of MP3 songs through a query-by-singing interface. MDCT (modified DCT) spectral coefficients are used directly to represent the tonic characteristics of a short-term sound. This spectral profile is used for detailed matching between two audio segments. Perceptual features are also computed from the MDCT coefficients for audio classification. Two pre-stages based on SVM and k-means classification remove incorrect (or noisy) segment candidates and speed up the subsequent matching process. In addition, exponential key-scaling schemes and time-warping techniques are developed to overcome key differences and tempo variation between singers. Experiments show that the retrieval probability of our design reaches up to 76% within the top 5 results, from a database of 114 excerpts.
  • Keywords
    audio signal processing; content-based retrieval; discrete cosine transforms; multimedia systems; music; pattern classification; pattern matching; probability; signal classification; spectral analysis; support vector machines; Internet; MDCT spectral coefficients; MP3 songs; SVM; audio classification; content-based retrieval; k-means classifications; key-scaling schemes; modified DCT spectral coefficients; multimedia content analysis; query by singing; retrieval probability; spectral profile; time-warping techniques; tonic characteristics; Acoustic noise; Content based retrieval; Content management; Digital audio players; Discrete cosine transforms; Information retrieval; Internet; Music information retrieval; Support vector machine classification; Support vector machines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1327264
  • Filename
    1327264