Author Institution :
Dept. of Electr. & Comput. Eng., Univ. of British Columbia, Vancouver, BC, Canada
Abstract :
We consider how local and global decision policies interact in stopping time problems such as quickest time change detection. Individual agents make myopic local decisions via social learning, that is, each agent records a private observation of a noisy underlying state process, selfishly optimizes its local utility and then broadcasts its local decision. Given these local decisions, how can a global decision maker achieve quickest time change detection when the underlying state changes according to a phase-type distribution? This paper presents four results. First, using Blackwell dominance of measures, it is shown that the optimal cost incurred in social-learning-based quickest detection is always larger than that of classical quickest detection. Second, it is shown that in general the optimal decision policy for social-learning-based quickest detection is characterized by multiple thresholds within the space of Bayesian distributions. Third, using lattice programming and stochastic dominance, sufficient conditions are given for the optimal decision policy to consist of a single linear hyperplane, or, more generally, a threshold curve. Estimation of the optimal linear approximation to this threshold curve is formulated as a simulation-based stochastic optimization problem. Finally, this paper shows that in multiagent sensor management with quickest detection, where each agent views the world according to its prior, the optimal policy has a similar structure to social learning.
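The social learning protocol described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: a two-state chain (state 0 = "change has occurred", absorbing; state 1 = "no change yet"), which gives a geometric change time, the simplest phase-type distribution; binary observations; a 0/1 misclassification cost for the local agents; and the function names and numeric values. Each agent predicts the public belief forward, folds in its private observation, and picks the action that minimizes its expected local cost. The global decision maker sees only that action, so it updates the public belief with the likelihood of the action rather than of the observation.

```python
# Illustrative parameters (assumed, not from the paper).
P = [[1.0, 0.0],      # state 0 ("changed") is absorbing
     [0.05, 0.95]]    # state 1 ("no change yet") jumps with prob. 0.05
B = [[0.8, 0.2],      # B[x][y]: probability of observation y in state x
     [0.2, 0.8]]
C = [[0.0, 1.0],      # C[a][x]: local cost of action a when the state is x
     [1.0, 0.0]]


def normalize(v):
    s = sum(v)
    return [x / s for x in v]


def predict(belief, P):
    """One-step Markov prediction of the belief (row vector times P)."""
    n = len(belief)
    return [sum(belief[i] * P[i][j] for i in range(n)) for j in range(n)]


def local_action(prior, B, C, y):
    """Myopic agent: private Bayes update on y, then minimize expected cost."""
    post = normalize([prior[x] * B[x][y] for x in range(len(prior))])
    costs = [sum(C[a][x] * post[x] for x in range(len(post)))
             for a in range(len(C))]
    return min(range(len(C)), key=lambda a: costs[a])


def action_likelihood(prior, B, C):
    """R[x][a]: probability that an agent facing `prior` takes action a in state x."""
    R = [[0.0] * len(C) for _ in range(len(B))]
    for y in range(len(B[0])):
        a = local_action(prior, B, C, y)
        for x in range(len(B)):
            R[x][a] += B[x][y]
    return R


def public_update(belief, P, B, C, a):
    """Public belief update given only the broadcast local action a."""
    prior = predict(belief, P)
    R = action_likelihood(prior, B, C)
    unnorm = [prior[x] * R[x][a] for x in range(len(prior))]
    if sum(unnorm) == 0.0:
        return prior  # action has zero probability under the model; keep prediction
    return normalize(unnorm)


# Uninformative public belief: the action reveals the private observation.
informative = public_update([0.5, 0.5], P, B, C, 0)
# Strong public belief: every observation maps to the same action (herding),
# so the action carries no information and the belief is just the prediction.
herding = public_update([0.02, 0.98], P, B, C, 1)
```

In this sketch the action likelihood depends on the public belief itself, which is one way to see why the global stopping problem differs from classical quickest detection, where the detector has direct access to the observations.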
Keywords :
decision making; learning (artificial intelligence); statistical distributions; stochastic programming; Bayesian distribution; Blackwell dominance of measures; global decision makers; lattice programming; local decision makers; local utility; multiagent sensor management; myopic local decisions; optimal cost; optimal decision policy; optimal linear approximation; phase-type distribution; quickest detection POMDP; quickest time change detection; single linear hyperplane; social learning; stochastic dominance; stochastic optimization problem; sufficient conditions; threshold curve; Bayesian methods; Multiagent systems; Organizations; Protocols; Stochastic processes; Adaptive sensing; Blackwell dominance; multiagent sensor scheduling; partially observed Markov decision process (POMDP); quickest time Bayesian change detection