• DocumentCode
    2308672
  • Title
    Automatic linguistic segmentation of conversational speech
  • Author
    Stolcke, Andreas ; Shriberg, Elizabeth
  • Author_Institution
    Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA
  • Volume
    2
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    1005
  • Abstract
    As speech recognition moves toward more unconstrained domains such as conversational speech, we encounter a need to segment (or resegment) waveforms and recognizer output into linguistically meaningful units such as sentences. Toward this end, we present a simple automatic segmenter of transcripts based on N-gram language modeling. We also study the relevance of several word-level features for segmentation performance. Using only word-level information, we achieve 85% recall and 70% precision on linguistic boundary detection.
  • Keywords
    linguistics; natural languages; nomograms; speech recognition; N-gram language modeling; automatic linguistic segmentation; conversational speech; linguistic boundary detection; linguistically meaningful units; precision; recall; segmentation performance; speech recognition; speech recognizer output; transcript segmentation; unconstrained domains; waveform segmentation; word-level features; Acoustic signal detection; Acoustic waves; Automatic speech recognition; Decoding; Error analysis; Laboratories; Loudspeakers; Natural languages; Speech processing; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type
    conf
  • DOI
    10.1109/ICSLP.1996.607773
  • Filename
    607773