• DocumentCode
    744405
  • Title
    Vocal Removal From Multiobject Audio Using Harmonic Information for Karaoke Service
  • Author
    Jihoon Park ; Kwangki Kim ; Minsoo Hahn
  • Author_Institution
    Korea Adv. Inst. of Sci. & Technol., Daejeon, South Korea
  • Volume
    21
  • Issue
    4
  • fYear
    2013
  • fDate
    4/1/2013
  • Firstpage
    798
  • Lastpage
    805
  • Abstract
    Interactive audio services (IASs) usually provide users with audio editing functionality and they can render their own sounds according to their preference. For IASs, the spatial audio object coding (SAOC) is an appropriate multichannel coding tool that satisfies most of the required functionalities with relatively low bit rate. Nevertheless, the SAOC usually fails to remove a specific object successfully, especially the vocal object in the case of the Karaoke service. In addition, to expand the service to mobile environments, lower bit rate and complexity are required. Thus, we propose a new SAOC vocal harmonic coding technique to improve the background music quality in the Karaoke service. Namely, utilizing the harmonic information of the vocal object, we removed the harmonics of the vocal object remaining in the background music. Our experimental results confirm that the background music quality is improved by the proposed algorithm even with the low bit rate and complexity.
  • Keywords
    audio coding; audio equipment; channel coding; electronic music; interactive devices; IAS; SAOC vocal harmonic coding technique; audio editing functionality; background music quality; harmonic information; interactive audio services; karaoke service; mobile environments; multichannel coding tool; multiobject audio; spatial audio object coding; vocal object; vocal removal; Bit rate; Complexity theory; Decoding; Encoding; Harmonic analysis; Power harmonic filters; Quantization; Audio object; Karaoke service; spatial audio object coding; vocal harmonic information;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2012.2234116
  • Filename
    6400233