Title : 
Noise benefits in backpropagation and deep bidirectional pre-training
         
        
            Author : 
Audhkhasi, Kartik ; Osoba, Osonde ; Kosko, B.
         
        
            Author_Institution : 
Electr. Eng. Dept., Univ. of Southern California, Los Angeles, CA, USA
         
        
            Abstract : 
We prove that noise can speed convergence in the backpropagation algorithm. The proof consists of two separate results. The first result proves that the backpropagation algorithm is a special case of the generalized Expectation-Maximization (EM) algorithm for iterative maximum likelihood estimation. The second result uses the recent EM noise benefit to derive a sufficient condition for a noise benefit in backpropagation training. The noise adds directly to the training data. A noise benefit also applies to the deep bidirectional pre-training of the neural network as well as to the backpropagation training of the network. The geometry of the noise benefit depends on the probability structure of the neurons at each layer. Logistic sigmoidal neurons produce a forbidden noise region that lies below a hyperplane, so all noise on or above the hyperplane can only speed convergence of the neural network. The forbidden noise region is a sphere if the neurons have a Gaussian signal or activation function. These noise benefits all follow from the general noise benefit of the EM algorithm. Monte Carlo sample means estimate the population expectations in the EM algorithm. We demonstrate the noise benefits using MNIST digit classification.
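
The hyperplane geometry described in the abstract can be illustrated with a minimal sketch. The abstract does not give the hyperplane's equation, so the code below assumes a hyperplane through the origin with a placeholder normal vector `normal`; the function name `sample_allowed_noise`, the Gaussian noise source, and the `scale` parameter are likewise illustrative assumptions, not the authors' procedure. The sketch rejection-samples noise and keeps only the draws that fall on or above the hyperplane, then adds one draw to a training target.

```python
import random


def sample_allowed_noise(normal, n_samples, scale=0.1, seed=0):
    """Rejection-sample zero-mean Gaussian noise vectors n with normal . n >= 0.

    `normal` is a hypothetical hyperplane normal (assumption: the hyperplane
    passes through the origin); the source abstract does not specify it.
    """
    rng = random.Random(seed)
    kept = []
    while len(kept) < n_samples:
        candidate = [rng.gauss(0.0, scale) for _ in normal]
        # Keep the draw only if it lies on or above the hyperplane.
        if sum(w * n for w, n in zip(normal, candidate)) >= 0.0:
            kept.append(candidate)
    return kept


# Usage: perturb a one-hot training target with one allowed noise draw.
normal = [1.0, -2.0, 0.5]   # placeholder normal vector (assumption)
target = [0.0, 1.0, 0.0]    # placeholder training target (assumption)
noise = sample_allowed_noise(normal, n_samples=1)[0]
noisy_target = [t + n for t, n in zip(target, noise)]
```

Rejection sampling is used here only because it is simple to read; roughly half of the zero-mean draws are discarded for a hyperplane through the origin.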
         
        
            Keywords : 
Gaussian processes; Monte Carlo methods; backpropagation; convergence; expectation-maximisation algorithm; geometry; neural nets; signal processing; transfer functions; EM algorithm; EM noise benefit; Gaussian signal; MNIST digit classification; Monte Carlo sample; activation function; backpropagation algorithm; backpropagation training; deep bidirectional pretraining; forbidden noise region; generalized expectation-maximization algorithm; hyperplane; iterative maximum likelihood estimation; logistic sigmoidal neurons; neural network; population expectations; probability structure; speed convergence; Biological neural networks; Maximum likelihood estimation; Neurons; Noise; Training; Expectation-Maximization algorithm; bidirectional associative memory; noise benefit; stochastic resonance;
         
        
            Conference_Title : 
Neural Networks (IJCNN), The 2013 International Joint Conference on
         
        
            Conference_Location : 
Dallas, TX
         
        
        
            Print_ISBN : 
978-1-4673-6128-6
         
        
        
            DOI : 
10.1109/IJCNN.2013.6707022