• DocumentCode
    974812
  • Title
    A Cascaded Broadcast News Highlighter
  • Author
    Christensen, Heidi ; Gotoh, Yoshihiko ; Renals, Steve
  • Author_Institution
    Dept. of Comput. Sci., Sheffield Univ., Sheffield
  • Volume
    16
  • Issue
    1
  • fYear
    2008
  • Firstpage
    151
  • Lastpage
    161
  • Abstract
    This paper presents a fully automatic news skimming system which takes a broadcast news audio stream and provides the user with the segmented, structured, and highlighted transcript. This constitutes a system with three different, cascading stages: converting the audio stream to text using an automatic speech recognizer, segmenting into utterances and stories, and finally determining which utterances should be highlighted using a saliency score. Each stage must operate on the erroneous output from the previous stage in the system, an effect which is naturally amplified as the data progresses through the processing stages. We present a large corpus of transcribed broadcast news data enabling us to investigate to what degree information worth highlighting survives this cascading of processes. Both extrinsic and intrinsic experimental results indicate that mistakes in the story boundary detection have a strong impact on the quality of highlights, whereas erroneous utterance boundaries cause only minor problems. Further, the difference in transcription quality does not affect the overall performance greatly.
  • Keywords
    audio signal processing; broadcasting; speech recognition; audio stream; automatic news skimming system; automatic speech recognizer; cascaded news broadcasting; Audio recording; Automatic speech recognition; Data mining; Digital audio broadcasting; Natural languages; Radio broadcasting; Speech processing; Streaming media; TV broadcasting; Text recognition; Information extraction; speech understanding; spoken language processing; statistical modeling
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2007.910746
  • Filename
    4383075