• DocumentCode
    1290616
  • Title

    A stochastic model of human-machine interaction for learning dialog strategies

  • Author

    Levin, Esther ; Pieraccini, Roberto ; Eckert, Wieland

  • Author_Institution
    Shannon Lab., AT&T Bell Labs., Florham Park, NJ, USA
  • Volume
    8
  • Issue
    1
  • fYear
    2000
  • fDate
    1/1/2000 12:00:00 AM
  • Firstpage
    11
  • Lastpage
    23
  • Abstract
    We propose a quantitative model for dialog systems that can be used for learning the dialog strategy. We claim that the problem of dialog design can be formalized as an optimization problem with an objective function reflecting different dialog dimensions relevant for a given application. We also show that any dialog system can be formally described as a sequential decision process in terms of its state space, action set, and strategy. With additional assumptions about the state transition probabilities and cost assignment, a dialog system can be mapped to a stochastic model known as a Markov decision process (MDP). A variety of data-driven algorithms for finding the optimal strategy (i.e., the one that optimizes the criterion) is available within the MDP framework, based on reinforcement learning. For an effective use of the available training data we propose a combination of supervised and reinforcement learning: the supervised learning is used to estimate a model of the user, i.e., the MDP parameters that quantify the user's behavior. Then a reinforcement learning algorithm is used to estimate the optimal strategy while the system interacts with the simulated user. This approach is tested for learning the strategy in an air travel information system (ATIS) task. The experimental results we present in this paper show that it is indeed possible to find a simple criterion, a state space representation, and a simulated user parameterization in order to automatically learn a relatively complex dialog behavior, similar to one that was heuristically designed by several research groups.
  • Keywords
    Markov processes; decision theory; information systems; interactive systems; learning (artificial intelligence); natural language interfaces; optimisation; probability; speech recognition; travel industry; Markov decision process; air travel information system; cost assignment; dialog strategies; dialog systems; experiment; human-machine interaction; optimal strategy; optimization problem; quantitative model; reinforcement learning; sequential decision process; state space representation; state transition probabilities; stochastic model; supervised learning; Costs; Design optimization; Information systems; Man machine systems; State-space methods; Stochastic processes; Stochastic systems; Supervised learning; System testing; Training data;
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type

    jour

  • DOI
    10.1109/89.817450
  • Filename
    817450
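
The two-stage procedure described in the abstract (a supervised estimate of the user model, followed by reinforcement learning against the simulated user) can be illustrated with a minimal sketch. This is not the paper's ATIS system: the two-slot task, the `ask`/`close` action set, the cost values, the toy corpus, and all function names below are assumptions made for illustration only, and tabular Q-learning stands in for whichever reinforcement learning algorithm the authors used.

```python
import random

# Everything below is a hypothetical toy setup, not the paper's ATIS task.
N_SLOTS = 2                 # state = number of slots already filled (0..2)
ACTIONS = ["ask", "close"]  # assumed action set
TURN_COST = 1.0             # assumed cost of one system question
MISSING_COST = 5.0          # assumed cost per unfilled slot when closing


def estimate_user_model(corpus):
    """Supervised step: relative frequency with which users answered a question."""
    answered = sum(1 for was_answered in corpus if was_answered)
    return answered / len(corpus)


def simulate_turn(state, action, p_answer, rng):
    """Simulated user. Returns (cost, next_state); None marks the end of the dialog."""
    if action == "close":
        return MISSING_COST * (N_SLOTS - state), None
    if state < N_SLOTS and rng.random() < p_answer:
        return TURN_COST, state + 1   # the user supplied the requested slot
    return TURN_COST, state           # question asked, nothing gained


def q_learning(p_answer, episodes=5000, alpha=0.1, epsilon=0.2,
               max_turns=20, seed=0):
    """Tabular Q-learning that minimizes the expected total dialog cost."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_SLOTS + 1) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(max_turns):
            if rng.random() < epsilon:   # explore
                action = rng.choice(ACTIONS)
            else:                        # exploit: pick the cheapest action
                action = min(ACTIONS, key=lambda a: q[(state, a)])
            cost, nxt = simulate_turn(state, action, p_answer, rng)
            future = 0.0 if nxt is None else min(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (cost + future - q[(state, action)])
            if nxt is None:
                break
            state = nxt
    return q


def greedy_policy(q):
    """Strategy: the lowest-cost action in each state."""
    return {s: min(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_SLOTS + 1)}


# Toy "corpus": 8 of 10 logged questions were answered (made-up data).
corpus = [True] * 8 + [False] * 2
p_hat = estimate_user_model(corpus)
policy = greedy_policy(q_learning(p_hat))
print(p_hat, policy)
```

Under these assumed costs, closing early is far more expensive than asking again, so the learned strategy should be to keep asking until both slots are filled and only then close. Changing `MISSING_COST` relative to `TURN_COST` changes the objective function, which is the sense in which the dialog design is cast as an optimization problem.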