DocumentCode
312331
Title
Word predictability after hesitations: a corpus-based study
Author
Shriberg, Elizabeth ; Stolcke, Andreas
Author_Institution
Speech Technol. & Res. Lab., SRI Int., Menlo Park, CA, USA
Volume
3
fYear
1996
fDate
3-6 Oct 1996
Firstpage
1868
Abstract
We ask whether lexical hesitations in spontaneous speech tend to precede words that are difficult to predict. We define predictability in terms of both transition probability and entropy, in the context of an N-gram language model. The results show that the transition probability is significantly lower at hesitation transitions, and that this is attributable to both the following word and the word history. In addition, the results suggest that fluent transitions in sentences with a hesitation elsewhere are significantly more likely to contain out-of-vocabulary words and novel word combinations. Such findings could be used to improve statistical language modeling for spontaneous speech applications.
Keywords
entropy; linguistics; nomograms; probability; psychology; speech; N-gram language model; corpus-based study; entropy; fluent transitions; following word; hesitation transitions; lexical hesitations; novel word combinations; out-of-vocabulary words; sentences; spontaneous speech; statistical language modeling; transition probability; word history; word predictability; Context modeling; Entropy; History; Humans; Laboratories; Natural languages; Predictive models; Probability; Speech; Testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
0-7803-3555-4
Type
conf
DOI
10.1109/ICSLP.1996.607996
Filename
607996