Title :
On exploiting symmetry for multilayer perceptron learning
Author :
Mizutani, Eiji ; Fan, Jing-Yun Carey
Author_Institution :
Nat. Taiwan Univ. of Sci. & Technol., Taipei
Abstract :
The classical B-bit parity two-class pattern classification problem can be solved by a multilayer perceptron (MLP) with only H = ⌈(B+1)/2⌉ hidden nodes. Using "symmetric" arguments, we show the following: (1) the H hidden nodes may consist of one "linear" node and (H-1) binary-valued step-function nodes (i.e., McCulloch-Pitts neurons), which yields a simple solution with integer-valued weights; (2) the posed model can be transformed to a well-known triangular-shaped network with (H-1) hidden nodes and direct connections from the B inputs to the terminal node; (3) "weight-sharing" simplifies the structure and thus makes it easier to find a solution; (4) it is possible to find in O(H) an optimal set of weights for an MLP with H "tanh" hidden nodes; and (5) a new scheme is designed to supply desired outputs to at most (H-1) hidden nodes, so that learning becomes insensitive to the initial weights. These findings on how sigmoid hidden nodes saturate at a solution are closely related to the plateau phenomena that often occur in MLP learning on parity. Since the phenomena are related to the saddle-point issue, we investigated the indefiniteness of the Hessian matrix H of the sum-squared-error measure: when H is indefinite, a learning algorithm that exploits negative curvature can efficiently find a separating hyperplane by moving away from the nearest saddle point.
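Finding (1) can be illustrated with a minimal Python sketch. It is not taken from the paper: it assumes one natural construction in which every input weight is shared and equal to 1, the step nodes have thresholds 2, 4, ..., 2(H-1), and the bracket in H = [(B+1)/2] is read as a ceiling. The names step, parity_mlp and check are hypothetical.

```python
from itertools import product


def step(z):
    """McCulloch-Pitts unit: outputs 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0


def parity_mlp(x):
    """Parity of a 0/1 input vector via one linear and (H-1) step hidden nodes.

    Assumed construction (illustrative, not necessarily the authors'):
    all input-to-hidden weights are shared and equal to 1, the k-th step
    node has bias -2k, and the output weights are +1 (linear node) and
    -2 (each step node). All weights are integers.
    """
    B = len(x)
    H = (B + 2) // 2              # integer form of ceil((B + 1) / 2)
    s = sum(x)                    # linear hidden node: number of active bits
    # Step node k fires when at least 2k bits are on, for k = 1 .. H-1.
    fired = sum(step(s - 2 * k) for k in range(1, H))
    # Since H - 1 = floor(B / 2), fired == floor(s / 2), so the output is s mod 2.
    return s - 2 * fired


def check(B):
    """Exhaustively compare the network with true parity on all 2**B inputs."""
    return all(parity_mlp(x) == sum(x) % 2 for x in product((0, 1), repeat=B))


if __name__ == "__main__":
    for B in range(1, 9):
        print(B, (B + 2) // 2, check(B))
```

Because every hidden node sees the same sum of inputs, the network depends on the input only through the count of active bits, which is the symmetry that weight-sharing exploits in this sketch.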
Keywords :
Hessian matrices; learning (artificial intelligence); multilayer perceptrons; pattern classification; Hessian matrix; McCulloch-Pitts neurons; binary-valued step-functions; classical B-bit parity two-class pattern classification problem; integer-valued weights; multilayer perceptron learning; plateau phenomena; saddle-point issue; sigmoid hidden nodes; sum-squared-error measure; symmetric arguments; triangular-shaped network; Cost function; Gradient methods; Graphics; Least squares methods; Multi-layer neural network; Multilayer perceptrons; Neural networks; Neurons; Pattern classification; Space technology;
Conference_Title :
Neural Networks, 2007. IJCNN 2007. International Joint Conference on
Conference_Location :
Orlando, FL
Print_ISBN :
978-1-4244-1379-9
Electronic_ISBN :
1098-7576
DOI :
10.1109/IJCNN.2007.4371413