Title :
Reinforcement learning aided smart-home decision-making in an interactive smart grid
Author :
Li, Ding ; Jayaweera, Sudharman K.
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of New Mexico, Albuquerque, NM, USA
Abstract :
In this paper, a complete hierarchical architecture is presented for the utility-customer interaction, which tightly connects several important research topics, such as customer load prediction, renewable generation integration, power-load balancing, and demand response. The complete interaction cycle consists of two stages: (1) initial interaction (long-term planning) and (2) real-time interaction (short-term planning). A hidden mode Markov decision process (HM-MDP) model is developed for customer real-time decision making, which outperforms the conventional Markov decision process (MDP) model in handling the non-stationary environment. To obtain a low-complexity, real-time algorithm that can adaptively incorporate new observations as the environment changes, we resort to Q-learning based approximate dynamic programming (ADP). Because it does not require specific starting and ending points of the scheduling period, the Q-learning algorithm offers more flexibility in practice. Performance analyses of both the exact and approximate algorithms are presented with simulation results, in comparison with other decision-making strategies.
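To make the Q-learning based decision making concrete, the following is a minimal tabular Q-learning sketch for scheduling a flexible smart-home load under time-varying prices. It is not the paper's HM-MDP formulation or its ADP algorithm: the state space (price level, pending load), action set (defer/run), reward, and toy price dynamics are illustrative assumptions, and the hidden-mode structure is omitted.

import random
from collections import defaultdict

# Hypothetical state: (price_level in {1,2,3}, pending_load in {0,1}).
ACTIONS = [0, 1]          # 0: defer the flexible load, 1: run it now
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

Q = defaultdict(float)    # Q[(state, action)] -> estimated long-run value

def choose_action(state):
    # Epsilon-greedy selection over current Q estimates.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state):
    # One Q-learning backup toward the bootstrapped target.
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

def simulate_step(state, action):
    # Toy environment (assumed): running the load pays the current price
    # but clears the backlog; deferring incurs a small discomfort penalty.
    # Prices drift randomly; the paper's hidden-mode dynamics are not modeled.
    price, pending = state
    if action == 1 and pending:
        reward, next_pending = -price, 0
    else:
        reward, next_pending = (-0.5 if pending else 0.0), pending
    next_price = max(1, min(3, price + random.choice([-1, 0, 1])))
    return reward, (next_price, next_pending)

state = (2, 1)  # medium price, one pending flexible load
for _ in range(10000):
    action = choose_action(state)
    reward, next_state = simulate_step(state, action)
    q_update(state, action, reward, next_state)
    state = next_state

Because the update is applied per observed transition, the learner needs no fixed start or end of the scheduling horizon, which reflects the flexibility argument made in the abstract.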
Keywords :
Markov processes; approximation theory; decision making; dynamic programming; game theory; home automation; learning (artificial intelligence); power engineering computing; power system planning; smart power grids; ADP; HM-MDP model; Q-learning based approximate dynamic programming; complete hierarchical architecture; customer load prediction; customer real-time decision making; demand response; hidden mode Markov decision process model; interactive smart grid; long-term planning; power-load balancing; reinforcement learning aided smart-home decision-making; renewable generation integration; short-term planning; utility-customer interaction; Approximation algorithms; Decision making; Heuristic algorithms; Hidden Markov models; Markov processes; Real-time systems; Scheduling; Baum-Welch algorithm; Q-learning; Smart-home; approximate dynamic programming (ADP); hidden mode Markov decision process (HM-MDP);
Conference_Title :
Green Energy and Systems Conference (IGESC), 2014 IEEE
Conference_Location :
Long Beach, CA
DOI :
10.1109/IGESC.2014.7018632