DocumentCode :
1081965
Title :
On Expediency and Convergence in Variable-Structure Automata
Author :
Chandrasekaran, B. ; Shen, David W C
Author_Institution :
Moore School of Electrical Engineering, University of Pennsylvania, Philadelphia; now with the Philco-Ford Corporation, Blue Bell, Pa. 19422
Volume :
4
Issue :
1
fYear :
1968
fDate :
3/1/1968
Firstpage :
52
Lastpage :
60
Abstract :
A stochastic automaton responds to penalties from a random environment through a reinforcement scheme, changing its state probability distribution in such a way as to reduce the average penalty received. In this manner the automaton is said to possess a variable structure and the ability to learn. This paper discusses the efficiency of learning for an m-state automaton in terms of expediency and convergence, under two distinct types of reinforcement schemes: one based on penalty probabilities and the other on penalty strengths. The functional relationship between the successive probabilities in the reinforcement scheme may be either linear or nonlinear. The stability of the asymptotic expected values of the state probabilities is discussed in detail. The conditions for optimal and expedient behavior of the automaton are derived. Reduction of the probability of suboptimal performance by adopting the Beta model of mathematical learning theory is discussed. Convergence is discussed in the light of variance analysis. The initial learning rate is used as a measure of the overall convergence rate. Learning curves can be obtained by solving nonlinear difference equations relating the successive expected values. An analytic expression for the convergence behavior of the linear case is derived. It is shown that, by a suitable choice of the reinforcement scheme, it is possible to increase the separation of the asymptotic state probabilities.
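As a rough illustration of the linear case mentioned in the abstract, the sketch below implements a standard textbook linear reward-penalty update for an m-state automaton. It is not taken from the paper: the function names `linear_update` and `simulate`, the step sizes `a` and `b`, and the example penalty probabilities are all assumptions made for illustration, and the paper's own schemes may differ in form.

```python
import random


def linear_update(p, i, penalized, a=0.1, b=0.0):
    """One step of a textbook linear reward-penalty update (assumed form).

    p         -- current state probability vector (sums to 1)
    i         -- index of the state that was just selected
    penalized -- True if the environment returned a penalty
    a, b      -- reward and penalty step sizes (hypothetical parameters)
    """
    m = len(p)
    if penalized:
        # Shrink the selected state's probability and spread the
        # removed mass evenly over the other m - 1 states.
        return [(1 - b) * pj if j == i else b / (m - 1) + (1 - b) * pj
                for j, pj in enumerate(p)]
    # No penalty: move the selected state's probability toward 1
    # and scale every other probability down by (1 - a).
    return [pj + a * (1 - pj) if j == i else (1 - a) * pj
            for j, pj in enumerate(p)]


def simulate(penalty_probs, steps=5000, a=0.05, b=0.0, seed=0):
    """Run the automaton against a stationary random environment.

    penalty_probs[i] is the probability that state i draws a penalty.
    Returns the final state probability vector.
    """
    rng = random.Random(seed)
    m = len(penalty_probs)
    p = [1.0 / m] * m  # start from the uniform (pure-chance) distribution
    for _ in range(steps):
        i = rng.choices(range(m), weights=p)[0]
        penalized = rng.random() < penalty_probs[i]
        p = linear_update(p, i, penalized, a, b)
    return p


if __name__ == "__main__":
    final_p = simulate([0.8, 0.2, 0.6])  # illustrative penalty probabilities
    print(final_p)
```

In this setting, behavior is usually called expedient when the average penalty under the final distribution is lower than the pure-chance average of the penalty probabilities. Setting `b = 0` gives the reward-inaction variant, in which a penalty leaves the probabilities unchanged.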
Keywords :
Analysis of variance; Asymptotic stability; Convergence; Difference equations; Learning automata; Learning systems; Mathematical model; Probability distribution; Stochastic processes; Stochastic systems
fLanguage :
English
Journal_Title :
Systems Science and Cybernetics, IEEE Transactions on
Publisher :
IEEE
ISSN :
0536-1567
Type :
jour
DOI :
10.1109/TSSC.1968.300188
Filename :
4082117