• DocumentCode
    323547
  • Title
    A hidden Markov model approach to text segmentation and event tracking
  • Author
    Yamron, J.P. ; Carp, I. ; Gillick, L. ; Lowe, S. ; van Mulbregt, P.
  • Author_Institution
    Dragon Syst. Inc., Newton, MA, USA
  • Volume
    1
  • fYear
    1998
  • fDate
    12-15 May 1998
  • Firstpage
    333
  • Abstract
    Continuing progress in the automatic transcription of broadcast speech via speech recognition has raised the possibility of applying information retrieval techniques to the resulting (errorful) text. For these techniques to be easily applicable, it is highly desirable that the transcripts be segmented into stories. This paper introduces a general methodology based on HMMs and on classical language modeling techniques for automatically inferring story boundaries and for retrieving stories relating to a specific event. In this preliminary work, we report some highly promising results on accurate text. Future work will apply these techniques to errorful transcripts.
  • Keywords
    broadcasting; hidden Markov models; information retrieval; natural languages; speech processing; speech recognition; word processing; HMM; automatic transcription; broadcast speech; event tracking; hidden Markov model; information retrieval techniques; language modeling; speech recognition; story boundaries; text segmentation; transcripts; Broadcasting; Computer errors; Data mining; Hidden Markov models; Information filtering; Information filters; Information retrieval; Speech recognition; Streaming media; Text recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
  • Conference_Location
    Seattle, WA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-4428-6
  • Type
    conf
  • DOI
    10.1109/ICASSP.1998.674435
  • Filename
    674435