• DocumentCode
    3246121
  • Title
    Automatic indexing of multimedia content by integration of audio, spoken language, and visual information
  • Author
    Ohtsuki, Katsutoshi ; Bessho, Katsuji ; Matsuo, Yoshihiro ; Matsunaga, Shoichi ; Hayashi, Yoshihiko
  • Author_Institution
    NTT Cyber Space Labs., NTT Corp., Kanagawa, Japan
  • fYear
    2003
  • fDate
    30 Nov.-3 Dec. 2003
  • Firstpage
    601
  • Lastpage
    606
  • Abstract
    This paper describes an automatic multimedia content indexing system that includes acoustic segmentation, automatic speech recognition, topic segmentation, and video indexing features. The system is intended for indexing of multimedia news programs. Speech segments extracted from news content are delivered to the speech recognition module. The speech recognition result is segmented into topics using a segmentation algorithm based on word conceptual vectors. The indexing results derived from audio and speech information are integrated with video indexing results to extract the story structure. Experimental results show that topic segmentation using word conceptual vectors is superior to the conventional method using local word co-occurrence frequencies, and that the integrated segmentation provides better news story structures than would be possible with any single type of information.
  • Keywords
    feature extraction; indexing; multimedia computing; natural languages; speech recognition; video signal processing; acoustic segmentation; automatic multimedia content indexing; automatic speech recognition; integrated news story structure extraction; multimedia news programs; natural language processing technology; spoken language; topic segmentation; video indexing; word conceptual vectors; Automatic speech recognition; Broadcasting; Content based retrieval; Data mining; Machine assisted indexing; Multimedia systems; Natural languages; Speech enhancement; Speech processing; Speech recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
  • Print_ISBN
    0-7803-7980-2
  • Type
    conf
  • DOI
    10.1109/ASRU.2003.1318508
  • Filename
    1318508