Title :
A method for evaluating and comparing user simulations: The Cramér-von Mises divergence
Author :
Williams, Jason D.
Author_Institution :
AT&T Labs - Research, Florham Park, NJ
Abstract :
Although user simulations are increasingly employed in the development and assessment of spoken dialog systems, there is no accepted method for evaluating user simulations. In this paper, we propose a novel quality measure for user simulations. We view a user simulation as a predictor of the performance of a dialog system, where per-dialog performance is measured with a domain-specific scoring function. The quality of the user simulation is measured as the divergence between the distributions of scores observed in real and simulated dialogs, and we argue that the Cramér-von Mises divergence is well-suited to this task. The technique is demonstrated on a corpus of real calls, and we present a table of critical values that practitioners can use to interpret the statistical significance of comparisons between user simulations.
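Illustrative_Example :
As a rough sketch of the evaluation idea summarized in the abstract, the Python snippet below compares per-dialog scores from real and simulated dialogs by measuring the gap between their empirical score distributions. The function names, the toy score values, and the simple mean-squared-difference normalization are assumptions made here for illustration; the paper defines its own normalized Cramér-von Mises divergence, whose exact normalization may differ.

import numpy as np

def empirical_cdf(samples, grid):
    # Fraction of `samples` less than or equal to each point in `grid`.
    samples = np.sort(np.asarray(samples, dtype=float))
    return np.searchsorted(samples, grid, side="right") / len(samples)

def cvm_divergence(real_scores, sim_scores):
    # Cramér-von Mises-style divergence between the two empirical CDFs,
    # evaluated at the real-dialog scores (illustrative normalization:
    # the mean squared difference; the paper's definition may differ).
    grid = np.sort(np.asarray(real_scores, dtype=float))
    return float(np.mean((empirical_cdf(real_scores, grid)
                          - empirical_cdf(sim_scores, grid)) ** 2))

# A lower divergence means the simulation's score distribution more
# closely matches the real dialogs, so simulation A is preferred here.
real = [0.8, 0.6, 0.9, 0.4, 0.7]
sim_a = [0.75, 0.65, 0.85, 0.5, 0.7]
sim_b = [0.1, 0.2, 0.3, 0.2, 0.1]
print(cvm_divergence(real, sim_a), cvm_divergence(real, sim_b))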
Keywords :
interactive systems; speech recognition; statistical analysis; user modelling; Cramér-von Mises divergence; domain-specific scoring function; quality measure; spoken dialog system; user simulation; Algorithm design and analysis; Computational modeling; Design optimization; Hidden Markov models; Humans; Laboratories; Machine learning; Machine learning algorithms; Predictive models; Speech recognition; User simulation; dialog management; dialog simulation
Conference_Title :
2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)
Conference_Location :
Kyoto, Japan
Print_ISBN :
978-1-4244-1746-9
Electronic_ISBN :
978-1-4244-1746-9
DOI :
10.1109/ASRU.2007.4430164