• DocumentCode
    3226432
  • Title
    A New Multimedia Content Skimming Method Based on Speech Emphasis Extraction and Its Application to Content Variations
  • Author
    Hidaka, Kota ; Nakajima, Shinya ; Niihara, Yasuyuki
  • Author_Institution
    NTT Cyber Solutions Labs., NTT East Corp.
  • fYear
    2006
  • fDate
    Dec. 2006
  • Firstpage
    716
  • Lastpage
    719
  • Abstract
    We propose Choco-Para, a multimedia content skimming technique, and describe its application to a variety of content types. Based on automatic speech emphasis extraction, Choco-Para extracts prosodic parameters (pitch, power, and speaking rate) from speech and uses them to estimate the degree of emphasis of each spoken phrase. By computing a degree-of-emphasis curve, Choco-Para can generate a skimmed edition at an arbitrary skimming rate by selecting emphasized speech portions via dynamic threshold logic. Choco-Para uses three types of prosodic parameters and both short-term and long-term deviation. Experiments assess the contributions of each prosodic parameter and deviation type. They show that estimation accuracy is optimized by using both short-term and long-term deviation with regard to pitch, power, and speaking rate. The results confirm that Choco-Para supports a wide variety of multimedia content.
  • Keywords
    content management; feature extraction; speech recognition; Choco-Para; automatic speech emphasis extraction; content variation; emphasis curve; multimedia content skimming technique; prosodic parameters; Automatic speech recognition; Broadband communication; Data mining; Explosions; Face recognition; Logic; Mobile handsets; Multimedia systems; Speech analysis; Streaming media
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia, 2006. ISM'06. Eighth IEEE International Symposium on
  • Conference_Location
    San Diego, CA
  • Print_ISBN
    0-7695-2746-9
  • Type
    conf
  • DOI
    10.1109/ISM.2006.6
  • Filename
    4061239
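
The abstract mentions producing a skimmed edition at an arbitrary skimming rate by thresholding per-phrase emphasis. The sketch below is one plausible reading of that idea, not the authors' implementation: it assumes each phrase already carries an emphasis score, and the names `Phrase` and `skim` as well as the greedy duration budget are hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Hypothetical phrase record: (start_sec, end_sec, emphasis_score).
Phrase = Tuple[float, float, float]


def skim(phrases: List[Phrase], rate: float) -> List[Phrase]:
    """Keep the most emphasized phrases within a duration budget.

    Assumption: lowering a threshold on the emphasis score until the
    selected duration would exceed rate * total duration is equivalent
    to taking phrases in descending score order and stopping at the
    first one that does not fit.
    """
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")

    total = sum(end - start for start, end, _ in phrases)
    budget = rate * total

    # Walk phrases from most to least emphasized.
    ranked = sorted(phrases, key=lambda p: p[2], reverse=True)
    chosen: List[Phrase] = []
    used = 0.0
    for phrase in ranked:
        duration = phrase[1] - phrase[0]
        if used + duration > budget + 1e-9:
            break  # the threshold cannot drop further within the budget
        chosen.append(phrase)
        used += duration

    # Restore playback order for the skimmed edition.
    return sorted(chosen)


if __name__ == "__main__":
    demo = [(0.0, 2.0, 0.1), (2.0, 4.0, 0.9), (4.0, 6.0, 0.5), (6.0, 10.0, 0.7)]
    print(skim(demo, 0.6))
```

The threshold here is implicit: it is the score of the last phrase that fits the budget, so it moves with the requested rate rather than being fixed in advance.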