Regional Information Center for Science and Technology (RICeST) - Factorized decision forecasting via combining value-based and reward-based estimation

DocumentCode :

2887157

Title :

Factorized decision forecasting via combining value-based and reward-based estimation

Author :

Ziebart, Brian D.

Author_Institution :

Carnegie Mellon Univ., Pittsburgh, PA, USA

fYear :

2011

fDate :

28-30 Sept. 2011

Firstpage :

966

Lastpage :

973

Abstract :

A powerful recent perspective for predicting sequential decisions learns the parameters of decision problems that produce observed behavior as (near) optimal solutions. Under this perspective, behavior is explained in terms of utilities, which can often be defined as functions of state and action features to enable generalization across decision tasks. Two approaches have been proposed from this perspective: estimate a feature-based reward function and recursively compute values from it, or directly estimate a feature-based value function. In this work, we investigate the combination of these two approaches into a single learning task using directed information theory and the principle of maximum entropy. This enables uncovering which type of estimate is most appropriate (in terms of predictive accuracy and/or computational benefit) for different portions of the decision space.

Keywords :

decision theory; forecasting theory; learning (artificial intelligence); maximum entropy methods; action features; decision problems; directed information theory; factorized decision forecasting; feature-based reward function; feature-based value function; learning task; maximum entropy principle; reward-based estimation; sequential decision prediction; state features; value-based estimation; Entropy; Equations; Estimation; Mathematical model; Optimal control; Optimization; Strontium;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Communication, Control, and Computing (Allerton), 2011 49th Annual Allerton Conference on

Conference_Location :

Monticello, IL

Print_ISBN :

978-1-4577-1817-5

Type :

conf

DOI :

10.1109/Allerton.2011.6120271

Filename :

6120271

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2887157