In-depth analysis of the learning process for a small artificial neural network
ORAL
Abstract
Machine learning and artificial neural networks are among the most rapidly advancing tools in many fields, including physics. Neural networks have already proven to be valuable optimization methods in numerous scientific applications. Although the potential hidden inside these network architectures is tremendous, unleashing it fully requires a thorough understanding of the deep-learning mechanism used to train the networks. In our study, we investigate the loss landscape and the convergence dynamics of backpropagation for the logical exclusive-OR (XOR) gate by means of one of the simplest artificial neural networks composed of sigmoid neurons. We identify various optimal parameter sets of weights and biases that enable the correct logical mapping from the input neurons, via a single hidden layer with two neurons, to the output neuron. The state space of the neural network forms a nine-dimensional loss landscape, but three-dimensional cross sections already exhibit distinct features such as plateaus and channels. Our analysis of the learning process helps explain why backpropagation efficiently achieves convergence toward zero loss while the values of the weights and biases keep drifting.
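For illustration, a minimal sketch of such a 2-2-1 sigmoid network trained on the XOR truth table by plain backpropagation could look as follows; the learning rate, initialization, and epoch count are illustrative assumptions, not the settings used in the study, and convergence for a given random start is not guaranteed (the plateaus mentioned above can trap the dynamics).

```python
# Minimal sketch (not the authors' code): a 2-2-1 sigmoid network trained on XOR
# by gradient descent. The nine trainable parameters are the 2x2 hidden weight
# matrix W1, the two hidden biases b1, the two output weights W2, and the
# output bias b2. Hyperparameters below are assumed values for illustration.
import numpy as np

rng = np.random.default_rng(0)

# XOR truth table: inputs and targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Random initial weights and biases (nine parameters in total)
W1 = rng.normal(size=(2, 2))   # input -> hidden weights
b1 = rng.normal(size=(1, 2))   # hidden biases
W2 = rng.normal(size=(2, 1))   # hidden -> output weights
b2 = rng.normal(size=(1, 1))   # output bias

eta = 1.0                      # learning rate (assumed)
for epoch in range(20000):
    # Forward pass over all four XOR patterns
    h = sigmoid(X @ W1 + b1)   # hidden activations
    out = sigmoid(h @ W2 + b2) # network output

    # Quadratic loss
    loss = 0.5 * np.sum((out - y) ** 2)

    # Backpropagation: gradients with respect to all nine parameters
    delta_out = (out - y) * out * (1.0 - out)       # output-layer error
    delta_h = (delta_out @ W2.T) * h * (1.0 - h)    # hidden-layer error

    W2 -= eta * (h.T @ delta_out)
    b2 -= eta * delta_out.sum(axis=0, keepdims=True)
    W1 -= eta * (X.T @ delta_h)
    b1 -= eta * delta_h.sum(axis=0, keepdims=True)

print("final loss:", loss)
print("outputs:", out.ravel().round(3))  # should approach [0, 1, 1, 0] if converged
```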
Publication: X. Yang, K. Arora, and M. Bachmann, "Dissecting a Small Artificial Neural Network", preprint (2023).
Presenters
- Xiguang Yang, University of Georgia

Authors
- Xiguang Yang, University of Georgia
- Krish Arora, University of Georgia
- Michael Bachmann, University of Georgia