Sparse spectra in learned representations of symmetries
ORAL
Abstract
Some symmetries are encoded into neural networks by hand, while others are learned from regularities in the training data. Recent work has found that, when language models are asked to learn integer addition modulo p, they eventually converge on a representation which encodes the symmetry of the problem. This representation emerges long after memorisation of the training examples, and allows perfect generalisation to novel examples, a phenomenon dubbed "grokking". The learned representation is a discrete Fourier encoding of the input and output integers, but curiously it uses only a sparse subset of the possible frequencies. Here we show that much simpler networks reproduce this sparse behaviour, allowing a more detailed analytic understanding. We then use information theory to study why this sparsity emerges, and we explore its implications for the robustness of the learned behaviour.
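To make the Fourier mechanism concrete, here is a minimal sketch (not the authors' code; the modulus p, the frequency k, and the helper name fourier_feature are illustrative choices). It shows why a sparse set of frequencies can suffice: embedding each integer as a point on the unit circle turns addition mod p into rotation, which the trigonometric addition formulas compute exactly.

```python
import numpy as np

p = 59   # modulus (illustrative choice)
k = 7    # a single Fourier frequency; the learned nets use a sparse set of these

def fourier_feature(x, k, p):
    """Embed integer x as a point on the unit circle at frequency k."""
    angle = 2 * np.pi * k * x / p
    return np.cos(angle), np.sin(angle)

# Addition mod p becomes rotation: the angle for (a + b) mod p is the sum of
# the angles for a and b, so cos/sin addition formulas implement the group law.
a, b = 13, 51
ca, sa = fourier_feature(a, k, p)
cb, sb = fourier_feature(b, k, p)
c_sum = ca * cb - sa * sb   # cos(theta_a + theta_b)
s_sum = sa * cb + ca * sb   # sin(theta_a + theta_b)

expected = fourier_feature((a + b) % p, k, p)
assert np.allclose((c_sum, s_sum), expected)
```

Since any single frequency k coprime to p makes x → kx mod p a bijection, the rotated coordinates already determine (a + b) mod p uniquely, which is consistent with the sparse spectra the abstract describes.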
Presenters
- Michael C Abbott, Yale University

Authors
- Michael C Abbott, Yale University
- Benjamin B Machta, Yale University