• DocumentCode
    427161
  • Title
    Detecting discussion scenes in instructional videos
  • Author
    Li, Ying ; Dorai, Chitra
  • Author_Institution
    IBM T. J. Watson Res. Center, Yorktown Heights, NY
  • Volume
    2
  • fYear
    2004
  • fDate
    30 June 2004
  • Firstpage
    1311
  • Abstract
    This paper addresses the problem of detecting discussion scenes in instructional videos using statistical approaches. Specifically, given a series of speech segments separated from the audio tracks of educational videos, we first model the instructor using a Gaussian mixture model (GMM). A four-state transition machine then extracts discussion scenes in real time, based on detected instructor-student speaker change points. Meanwhile, the GMM is continually updated to accommodate variation in the instructor's voice over time. Promising experimental results have been achieved on five educational videos from the IBM MicroMBA program, and interesting instruction/teaching patterns have been observed. The extracted scene information would facilitate the semantic indexing and structuralization of instructional video content.
  • Keywords
    Gaussian distribution; audio signal processing; feature extraction; speech processing; GMM; Gaussian mixture model; audio track separated speech segments; instruction/teaching patterns; instructional video discussion scene detection; instructor voice variation; instructor-student speaker change points; multiple-state transition machine; scene information extraction; semantic indexing; video structuralization; Cameras; Data mining; Educational institutions; Educational programs; Electronic learning; Hidden Markov models; Indexing; Layout; Speech; Videos
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo, 2004. ICME '04. 2004 IEEE International Conference on
  • Conference_Location
    Taipei
  • Print_ISBN
    0-7803-8603-5
  • Type
    conf
  • DOI
    10.1109/ICME.2004.1394468
  • Filename
    1394468