DocumentCode :
3739344
Title :
Understanding Deep Networks with Gradients
Author :
Henry Z. Lo;Wei Ding
Author_Institution :
Dept. of Comput. Sci., Univ. of Massachusetts Boston, Boston, MA, USA
fYear :
2015
Firstpage :
1548
Lastpage :
1555
Abstract :
Existing methods for understanding the inner workings of convolutional neural networks have relied on visualizations, which do not describe the connections between the layers and units of the network. We introduce the prediction gradient as a measure of a neuron's relevance to prediction. Using this quantity, we study a relatively small convolutional neural network and make three observations. First, there exists a small number of high prediction-gradient units which, upon removal, severely impair the network's ability to classify correctly. Second, this performance loss spans multiple classes and is not mirrored by removing low-gradient units. Third, the distributed representation of the neural network prevents performance from degrading until a critical number of units are destroyed, a number that depends strongly on the prediction gradient of the units removed. These three observations validate the utility of the prediction gradient in identifying important units in a neural network. Finally, we use the prediction gradient to generate and study adversarial examples.
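Example (illustrative only) :
Below is a minimal sketch of one plausible reading of the prediction gradient described in the abstract: the gradient of the predicted-class score with respect to each hidden unit's activation, followed by ablation of the highest-gradient units to probe their importance. The network architecture, the layer chosen, and the per-channel aggregation are assumptions made for illustration; this is not the authors' implementation.

# Hypothetical sketch: compute a "prediction gradient" per hidden unit and
# ablate the highest-gradient units. All design choices here are assumptions.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(16 * 7 * 7, num_classes)

    def forward(self, x):
        h = self.features(x)          # hidden activations of interest
        h.retain_grad()               # keep gradients w.r.t. these activations
        self._hidden = h
        return self.classifier(h.flatten(1))

model = SmallCNN()
x = torch.randn(1, 1, 28, 28)         # stand-in for an MNIST-sized image

logits = model(x)
score = logits[0, logits.argmax(dim=1)]   # predicted-class score
score.backward()                          # fills gradients of the hidden units

# Prediction gradient per channel: mean absolute gradient over spatial positions.
pred_grad = model._hidden.grad.abs().mean(dim=(0, 2, 3))
print("per-channel prediction gradient:", pred_grad)

# Zero out the k highest-gradient channels and observe how the prediction shifts.
k = 4
top = pred_grad.topk(k).indices
with torch.no_grad():
    h = model.features(x)
    h[:, top] = 0.0                       # "remove" high-gradient units
    ablated_logits = model.classifier(h.flatten(1))
print("original logits:", logits.detach())
print("ablated logits: ", ablated_logits)

Repeating the same ablation with the lowest-gradient channels would, per the abstract's second observation, be expected to change the output far less; that comparison is the experiment the paper describes, not something this sketch demonstrates.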
Keywords :
"Biological neural networks","Visualization","Training","Conferences","Computational modeling","Robustness"
Publisher :
ieee
Conference_Title :
Data Mining Workshop (ICDMW), 2015 IEEE International Conference on
Electronic_ISSN :
2375-9259
Type :
conf
DOI :
10.1109/ICDMW.2015.227
Filename :
7395858