Application of reinforcement learning in dynamic pricing algorithms

Author

Jintian, Wang ; Lei, Zhou

Author_Institution

Dept. of Comput. Sci. & Technol., Hefei Univ. of Technol., Hefei, China

fYear

2009

fDate

5-7 Aug. 2009

Firstpage

419

Lastpage

423

Abstract

This paper addresses the dynamic pricing problem for a duopoly in electronic retail markets. Combined with the concept of performance potential, the simulated annealing Q-learning (SA-Q) algorithm and the win-or-learn-fast policy hill climbing (WoLF-PHC) algorithm are used to solve the learning problems of multi-agent systems under either average- or discounted-reward criteria, in the case where only partial information about the opponent is known. The simulation results show that the WoLF-PHC algorithm adapts well to changes in the environment and obtains better learning values than the SA-Q algorithm.

Keywords

learning (artificial intelligence); multi-agent systems; pricing; retailing; simulated annealing; WoLF-PHC algorithm; average-reward criteria; discounted-reward criteria; duopoly; dynamic pricing algorithm; electronic retail market; multiagent system; performance potential; reinforcement learning; simulated annealing Q-learning; win-or-learn-fast policy hill climbing algorithm; Application software; Automation; Computational modeling; Computer science; Consumer electronics; Heuristic algorithms; Learning; Logistics; Pricing; Simulated annealing; WoLF-PHC; multi-agent

fLanguage

English

Publisher

IEEE

Conference_Titel

Automation and Logistics, 2009. ICAL '09. IEEE International Conference on

Conference_Location

Shenyang

Print_ISBN

978-1-4244-4794-7

Electronic_ISBN

978-1-4244-4795-4

Type

conf

DOI

10.1109/ICAL.2009.5262885

Filename

5262885