DocumentCode :
381278
Title :
Investigating stochastic speech understanding
Author :
Bonneau-Maynard, Hélène ; Lefevre, Francois
Author_Institution :
Lab. d'Informatique pour la Mécanique et les Sci. de l'Ingénieur, CNRS, Orsay, France
fYear :
2001
fDate :
2001
Firstpage :
260
Lastpage :
263
Abstract :
The need for human expertise in the development of a speech understanding system can be greatly reduced by the use of stochastic techniques. However, corpus-based techniques require the annotation of large amounts of training data. Manual semantic annotation of such corpora is tedious, expensive, and subject to inconsistencies. This work investigates the influence of the training corpus size on the performance of the understanding module. The use of automatically annotated data is also investigated as a means to increase the corpus size at a very low cost. First, a stochastic speech understanding model developed using data collected with the LIMSI ARISE dialog system is presented. Its performance is shown to be comparable to that of the rule-based caseframe grammar currently used in the system. In a second step, two ways of reducing the development cost are pursued: (1) reducing the amount of manually annotated data used to train the stochastic models and (2) using automatically annotated data in the training process.
Keywords :
interactive systems; natural language interfaces; speech recognition; speech-based user interfaces; stochastic processes; LIMSI ARISE dialog system; automatically annotated data; development cost reduction; performance; speech understanding system; training corpus size; Costs; Data mining; Humans; Natural languages; Performance evaluation; Speech analysis; Stochastic processes; Stochastic systems; Telephony; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on
Print_ISBN :
0-7803-7343-X
Type :
conf
DOI :
10.1109/ASRU.2001.1034637
Filename :
1034637