Author_Institution :
Sch. of Electr., Univ. of Wollongong, NSW, Australia
Abstract :
This article presents efficient training algorithms, based on first-order, second-order, and conjugate gradient optimization methods, for a class of convolutional neural networks (CoNNs) known as shunting inhibitory convolutional neural networks. Furthermore, a new hybrid method is proposed, derived from the principles of Quickprop, Rprop, SuperSAB, and least squares (LS). Experimental results show that the new hybrid method can perform as well as the Levenberg-Marquardt (LM) algorithm, but at a much lower computational cost and with lower memory requirements. For the sake of comparison, the visual pattern recognition task of face/nonface discrimination is chosen as the classification problem on which the training algorithms are evaluated. Sixteen training algorithms are implemented for three variants of the proposed CoNN architecture: binary-, Toeplitz-, and fully connected architectures. All implemented algorithms train the three network architectures successfully, but their convergence speeds vary markedly. In particular, the combinations of LS with the new hybrid method and of LS with the LM method achieve the best convergence rates in terms of the number of training epochs. In addition, the classification accuracies of all three architectures are assessed using ten-fold cross-validation. The results show that the binary- and Toeplitz-connected architectures slightly outperform the fully connected architecture: the lowest error rates across all training algorithms are 1.95% for the Toeplitz-connected, 2.10% for the binary-connected, and 2.20% for the fully connected network. In general, the modified Broyden-Fletcher-Goldfarb-Shanno (BFGS) methods, the three variants of the LM algorithm, and the new hybrid/LS method perform consistently well, achieving error rates of less than 3% averaged across all three architectures.
Keywords :
convolution; image processing; learning (artificial intelligence); least squares approximations; neural net architecture; pattern recognition; Toeplitz connected architecture; binary connected architecture; face/nonface discrimination; image processing; least square method; shunting inhibitory convolutional neural network; training algorithms; visual pattern recognition; Artificial neural networks; Biological neural networks; Convergence; Convolution; Error analysis; Least squares methods; Neural networks; Neurons; Optimization methods; Pattern recognition; Convolutional neural network (CoNN); first- and second-order training methods; shunting inhibitory neuron; Algorithms; Computer Simulation; Models, Statistical; Neural Inhibition; Neural Networks (Computer); Numerical Analysis, Computer-Assisted; Pattern Recognition, Automated; Regression Analysis; Signal Processing, Computer-Assisted; Stochastic Processes;
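Illustrative note: the abstract names Rprop as one of the methods from which the hybrid algorithm is derived, but it does not specify the hybrid update itself. As a rough orientation only, the following is a minimal sketch of a standard sign-based Rprop update (the common variant without weight backtracking), applied to a toy quadratic. It is not the authors' hybrid method; the function names `rprop_step` and `demo`, the toy objective, and all hyperparameter values are assumptions chosen for illustration.

```python
def sign(x):
    """Return -1.0, 0.0, or 1.0 according to the sign of x."""
    return (x > 0) - (x < 0)


def rprop_step(weights, grads, prev_grads, steps,
               eta_plus=1.2, eta_minus=0.5, step_min=1e-6, step_max=50.0):
    """One Rprop update (no weight backtracking); hyperparameters are assumed defaults.

    Each weight has its own step size. Only the sign of the gradient is used:
    the step grows when the gradient sign is unchanged and shrinks when it flips.
    """
    new_weights, kept_grads, new_steps = [], [], []
    for w, g, pg, s in zip(weights, grads, prev_grads, steps):
        product = g * pg
        if product > 0:
            # Same sign as last time: accelerate.
            s = min(s * eta_plus, step_max)
        elif product < 0:
            # Sign flip means a minimum was overshot: shrink the step and
            # skip the move for this weight in this iteration.
            s = max(s * eta_minus, step_min)
            g = 0.0
        new_weights.append(w - sign(g) * s)
        kept_grads.append(g)
        new_steps.append(s)
    return new_weights, kept_grads, new_steps


def demo(iterations=200):
    """Minimize the toy objective f(w) = (w0 - 3)^2 + 10 * (w1 + 1)^2."""
    weights = [0.0, 0.0]
    prev_grads = [0.0, 0.0]
    steps = [0.1, 0.1]
    for _ in range(iterations):
        grads = [2.0 * (weights[0] - 3.0), 20.0 * (weights[1] + 1.0)]
        weights, prev_grads, steps = rprop_step(weights, grads, prev_grads, steps)
    return weights


if __name__ == "__main__":
    print(demo())
```

Because the update depends only on gradient signs, the two coordinates converge at similar rates despite the tenfold difference in curvature, which is one reason sign-based schemes are inexpensive compared with second-order methods such as LM.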