Title :
Segmenting spoken language utterances into clauses for semantic classification
Author :
Gupta, Narendra K. ; Bangalore, Srinivas
Author_Institution :
AT&T Labs-Research, USA
Date :
30 Nov.-3 Dec. 2003
Abstract :
Robust spoken language understanding in large-scale conversational dialog applications is usually performed by classification of the user utterances into one or many semantic classes. The features used for classification are sensitive to variations caused by artifacts of spoken language, such as edits, repairs and other dysfluencies. Furthermore, the performance of these classifiers typically degrades when the user's utterance contains multiple semantic classes. In this paper, we present a semantic classification technique that first automatically removes dysfluencies and segments the user's utterance into clauses and then classifies the utterance based on the classification of the clauses. We show that this preprocessing improves the semantic classification accuracy for utterances and significantly decreases the amount of training data needed for a given classification accuracy level.
Keywords :
interactive systems; pattern classification; speech intelligibility; speech processing; speech recognition; automatic dysfluency removal; clauses; large-scale conversational dialog applications; performance; robust spoken language understanding; semantic classification accuracy; spoken language artifacts; user utterance segmentation; Degradation; Design methodology; Large-scale systems; Natural languages; Robustness; Speech analysis; Speech recognition; Text categorization; Training data;
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
DOI :
10.1109/ASRU.2003.1318495