• Title of article

    Dynamic packaging in e-retailing with stochastic demand over finite horizons: A Q-learning approach

  • Author/Authors

    Cheng, Yan (author)

  • Issue Information
    Journal with serial issue number, year 2009
  • Pages
    9
  • From page
    472
  • To page
    480
  • Abstract
    This paper investigates how an intelligent agent may use Q-learning, a simulation-based stochastic technique, to make optimal dynamic packaging decisions in an e-retailing setting. When a practical application of dynamic packaging involves a large number of products, standard Q-learning encounters two major problems caused by the excessively large state space. First, learning the Q-values in tabular form may be infeasible because of the amount of memory needed to store the table. Second, rewards in the state space may be so sparse that random exploration discovers them only extremely slowly. This paper first describes the state-dependent and event-driven nature of the dynamic packaging problem with a Markov decision process model, then proposes a state generalization approach based on a distortion measure, and finally puts forward a heuristic-based exploration/exploitation policy that improves the convergence of Q-learning. We validate our approach in a simulated test.
  • Keywords
    Q-learning , Dynamic packaging , E-retailing
  • Journal title
    Expert Systems with Applications
  • Serial Year
    2009
  • Record number

    2344961
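
  • Illustrative sketch

    The abstract names two obstacles for tabular Q-learning: the size of the Q-table and the sparsity of rewards. The following is a minimal, generic sketch of finite-horizon Q-learning with a coarse state key and epsilon-greedy exploration. It does not reproduce the paper's distortion-measure generalization or its heuristic exploration policy; the toy environment, the bucketing rule in `aggregate`, and every constant below are assumptions made purely for illustration.

```python
import random
from collections import defaultdict

HORIZON = 10                    # number of decision periods (assumed)
START_STOCK = 20                # initial inventory (assumed)
ACTIONS = (0, 1)                # 0 = full price, 1 = discount (hypothetical)
PRICE = {0: 10.0, 1: 7.0}       # revenue per unit sold (assumed)
SELL_PROB = {0: 0.3, 1: 0.6}    # purchase chance per shopper (assumed)


def step(stock, action, rng):
    """Simulate one period with up to 3 shoppers; return (reward, next_stock)."""
    demand = sum(rng.random() < SELL_PROB[action] for _ in range(3))
    sold = min(stock, demand)
    return PRICE[action] * sold, stock - sold


def aggregate(t, stock, bucket=5):
    """Map a raw state to a coarser key so that fewer Q-values are stored."""
    return (t, stock // bucket)


def train(episodes=2000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)      # Q-table keyed by (aggregated state, action)
    for _ in range(episodes):
        stock = START_STOCK
        for t in range(HORIZON):
            s = aggregate(t, stock)
            # Epsilon-greedy: explore at random with probability epsilon.
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[(s, x)])
            reward, stock = step(stock, a, rng)
            # Nothing can be earned after the final period.
            if t + 1 < HORIZON:
                s_next = aggregate(t + 1, stock)
                future = max(q[(s_next, x)] for x in ACTIONS)
            else:
                future = 0.0
            # Standard Q-learning update toward the one-step target.
            q[(s, a)] += alpha * (reward + gamma * future - q[(s, a)])
    return q
```

    Because `aggregate` groups inventory levels into buckets of five, the table holds at most 100 entries here instead of one entry per raw state and action. A coarser key saves memory at the cost of treating distinct states as identical.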