  • DocumentCode
    312331
  • Title
    Word predictability after hesitations: a corpus-based study
  • Author
    Shriberg, Elizabeth; Stolcke, Andreas
  • Author_Institution
    Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA
  • Volume
    3
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    1868
  • Abstract
    This paper asks whether lexical hesitations in spontaneous speech tend to precede words that are difficult to predict. We define predictability in terms of both transition probability and entropy, in the context of an N-gram language model. The results show that the transition probability is significantly lower at hesitation transitions, and that this is attributable to both the following word and the word history. In addition, the results suggest that fluent transitions in sentences with a hesitation elsewhere are significantly more likely to contain out-of-vocabulary words and novel word combinations. Such findings could be used to improve statistical language modeling for spontaneous speech applications.
  • Keywords
    entropy; linguistics; nomograms; probability; psychology; speech; N-gram language model; corpus-based study; fluent transitions; following word; hesitation transitions; lexical hesitations; novel word combinations; out-of-vocabulary words; sentences; spontaneous speech; statistical language modeling; transition probability; word history; word predictability; Context modeling; History; Humans; Laboratories; Natural languages; Predictive models; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type
    conf
  • DOI
    10.1109/ICSLP.1996.607996
  • Filename
    607996
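
The abstract defines predictability through two N-gram quantities: the transition probability of a word given its history, and the entropy of the next-word distribution after that history. The sketch below illustrates both on a toy example. It is not the authors' implementation: the bigram order, the add-one smoothing, the `<s>` start marker, the toy sentences, and the function names `train_bigram`, `transition_prob`, and `next_word_entropy` are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Count bigrams over tokenized sentences; "<s>" marks the sentence start (assumed)."""
    counts = defaultdict(Counter)
    vocab = set()
    for tokens in sentences:
        prev = "<s>"
        for tok in tokens:
            counts[prev][tok] += 1
            vocab.add(tok)
            prev = tok
    return counts, sorted(vocab)


def transition_prob(counts, vocab, history, word):
    """Add-one smoothed P(word | history); the smoothing choice is an assumption."""
    total = sum(counts[history].values())
    return (counts[history][word] + 1) / (total + len(vocab))


def next_word_entropy(counts, vocab, history):
    """Entropy, in bits, of the smoothed next-word distribution after `history`."""
    probs = (transition_prob(counts, vocab, history, w) for w in vocab)
    return -sum(p * math.log2(p) for p in probs)


# Toy data (hypothetical), with "um" standing in for a lexical hesitation.
sentences = [
    ["i", "want", "a", "flight"],
    ["i", "want", "um", "information"],
    ["show", "me", "a", "flight"],
]
counts, vocab = train_bigram(sentences)

# Compare a fluent transition with the transition that follows the hesitation.
p_fluent = transition_prob(counts, vocab, "a", "flight")
p_after_hesitation = transition_prob(counts, vocab, "um", "information")
h_after_hesitation = next_word_entropy(counts, vocab, "um")
print(p_fluent, p_after_hesitation, h_after_hesitation)
```

In this reading, a low transition probability means the word actually spoken was hard to predict, while a high entropy means the history left many continuations open. The abstract reports that the lower probability at hesitation transitions is attributable to both the following word and the word history.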