Title :
Scaling of back-propagation training time to large dimensions
Author :
Wilensky, Gregg D.; Neuhaus, Joseph A.
Abstract :
The training time for the back-propagation neural network algorithm is studied as a function of input dimension for the problem of discriminating between two overlapping multidimensional Gaussian distributions. This problem is simple enough (it is linearly separable for distributions which are not centered at the same point) to allow an analytic determination of the expected performance, yet it is realistic in the sense that many real-world problems have distributions of discriminants which are approximately Gaussian. The simulations are carried out for input dimensions ranging from 1 to 1000 and show that, for large enough N, the training time scales linearly with the input dimension N when a constant error criterion is used to determine when to terminate training. The slope of this linear dependence is a function of the error criterion and of the ratio of the standard deviation to the separation of the two Gaussian distributions. The smaller the separation, the longer the required training time. For each input dimension, a full statistical treatment was implemented by training the network 400 times, with a different random initialization of weights and biases each time. These results provide insight into the ultimate limitations of a straightforward implementation of back-propagation.
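The sketch below is a minimal illustration (not the authors' code) of the experiment described in the abstract: two overlapping N-dimensional Gaussian classes, a network trained by back-propagation from a random weight initialization, and a count of the epochs needed to reach a fixed error criterion, repeated across input dimensions. The sample counts, learning rate, separation, error criterion, and single-sigmoid-unit architecture are illustrative assumptions; the abstract does not specify them.

```python
# Illustrative sketch of the training-time-vs-dimension experiment.
# Assumptions (not from the paper): single sigmoid output unit, batch
# gradient descent, 200 samples per class, unit-variance Gaussians
# separated along the first coordinate, fixed MSE criterion.
import numpy as np

rng = np.random.default_rng(0)

def make_data(n_dim, n_per_class=200, separation=2.0, sigma=1.0):
    """Two overlapping Gaussian classes in n_dim dimensions."""
    mean1 = np.zeros(n_dim)
    mean2 = np.zeros(n_dim)
    mean2[0] = separation                 # class means differ along axis 0
    x = np.vstack([
        rng.normal(mean1, sigma, size=(n_per_class, n_dim)),
        rng.normal(mean2, sigma, size=(n_per_class, n_dim)),
    ])
    t = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    return x, t

def epochs_to_criterion(x, t, lr=0.5, criterion=0.05, max_epochs=10_000):
    """Train a single sigmoid unit by gradient descent (back-propagation
    for a one-layer net); return the epoch at which the mean squared
    error first falls below the constant criterion."""
    n_dim = x.shape[1]
    w = rng.normal(0.0, 0.1, size=n_dim)  # random initial weights
    b = 0.0                               # random/zero initial bias
    for epoch in range(1, max_epochs + 1):
        y = 1.0 / (1.0 + np.exp(-(x @ w + b)))   # forward pass
        err = y - t
        if np.mean(err ** 2) < criterion:        # constant error criterion
            return epoch
        delta = err * y * (1.0 - y)              # back-propagated error
        w -= lr * (x.T @ delta) / len(t)
        b -= lr * delta.mean()
    return max_epochs

# Training time as a function of input dimension N.
for n_dim in (1, 10, 100, 1000):
    x, t = make_data(n_dim)
    print(f"N = {n_dim:5d}  epochs to criterion = {epochs_to_criterion(x, t)}")
```

In this simplified setting, one would average the epoch count over many random initializations per dimension (the paper uses 400) before examining how it scales with N.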
Keywords :
learning systems; neural nets; back-propagation training time; error criterion; large dimensions; neural network algorithm; overlapping multidimensional Gaussian distributions; simulations
Conference_Title :
1990 IJCNN International Joint Conference on Neural Networks
Conference_Location :
San Diego, CA, USA
DOI :
10.1109/IJCNN.1990.137851