Geometrical structure of Neural Networks: Geodesics, Jeffrey's Prior and Hyper-ribbons

Lorien Hayden; Alexander Alemi; James Sethna

ORAL

Abstract

Neural networks are learning algorithms employed in a host of machine learning problems, including speech recognition, object classification, and data mining. In practice, a neural network learns a low-dimensional representation of high-dimensional data and thereby defines a model manifold: an embedding of this low-dimensional structure in the higher-dimensional space. In this work, we explore the geometrical structure of a neural network model manifold. A stacked denoising autoencoder and a deep belief network are trained on handwritten digits from the MNIST database. Constructing geodesics along the surface, and taking slices of the high-dimensional manifolds, reveals a hierarchy of widths corresponding to a hyper-ribbon structure. This property indicates that neural networks fall into the class of sloppy models, in which certain parameter combinations dominate the behavior. Employing this information could prove valuable in designing both neural network architectures and training algorithms.
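The notion of a sloppy model can be illustrated with a toy calculation. The sketch below is not the authors' method and does not involve a neural network: it assumes a hypothetical sum-of-exponentials model, a least-squares cost with unit Gaussian noise (so that the Fisher information metric is J^T J, with J the Jacobian of the predictions with respect to the parameters), and arbitrarily chosen parameter values and time points. Under those assumptions, the eigenvalues of the metric spread over many orders of magnitude, which is the kind of hierarchy associated with a hyper-ribbon.

```python
import numpy as np


def jacobian(theta, t):
    """Jacobian of the toy model y(t) = sum_i exp(-theta_i * t).

    Entry (m, i) is dy(t_m)/dtheta_i = -t_m * exp(-theta_i * t_m).
    The model itself is an illustrative assumption, not taken from the abstract.
    """
    return -t[:, None] * np.exp(-np.outer(t, theta))


# Hypothetical decay rates and sampling times, chosen only for illustration.
theta = np.array([1.0, 2.0, 3.0, 4.0])
t = np.linspace(0.1, 5.0, 50)

J = jacobian(theta, t)

# For least squares with unit noise, the Fisher information metric is J^T J.
# Its eigenvalues are the squared singular values of J (sorted descending).
singular_values = np.linalg.svd(J, compute_uv=False)
fim_eigenvalues = singular_values ** 2

# Ratio of the stiffest to the sloppiest direction in parameter space.
spread = fim_eigenvalues[0] / fim_eigenvalues[-1]

print("Fisher information eigenvalues:", fim_eigenvalues)
print("stiffest / sloppiest ratio:", spread)
```

A large ratio means that a few parameter combinations (the stiff directions) control the predictions, while the rest (the sloppy directions) barely affect them.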

March 7, 2014, 3:15 PM – March 7, 2014, 3:27 PM

Authors

Lorien Hayden
Cornell University

Alexander Alemi
Cornell University

James Sethna
Cornell University