Compression in Neural Networks via Weight Coupling

ORAL

Abstract

One of the most popular methods for compressing machine learning models is weight quantization, which reduces the precision of individual network weights rather than the number of weights. We explore the efficacy of adding a pairwise, attractive coupling between weights in order to encourage weight clustering and quantizability. We find that adding such a term to the task loss rapidly drives a pre-trained network to a quantized configuration with minimal impact on generalization. Our implementation is computationally efficient, and it yields mixed-precision models while relying on a small number of intuitive hyperparameters. We show that the "quantizability" of models via weight couplings has interesting implications for the loss-landscape geometry of problems in machine learning.
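As a rough illustration of what a pairwise, attractive coupling between weights could look like, the sketch below uses a short-range Gaussian attraction. The abstract does not specify the functional form, so the kernel, the width `sigma`, and the function names `coupling_energy` and `coupling_grad` are all assumptions made for illustration, not the authors' implementation. In training, such a term would presumably be added to the task loss with some coupling strength as a hyperparameter.

```python
import numpy as np

def coupling_energy(w, sigma=0.1):
    """Assumed form: E = -sum_{i<j} exp(-(w_i - w_j)^2 / (2 sigma^2)).

    Lower energy means more weights sit close together (clustered).
    """
    w = np.asarray(w, dtype=float).ravel()
    diff = w[:, None] - w[None, :]                 # all pairwise differences
    kernel = np.exp(-diff**2 / (2.0 * sigma**2))
    # Remove the i == j diagonal (each entry is 1) and count each pair once.
    return -(kernel.sum() - w.size) / 2.0

def coupling_grad(w, sigma=0.1):
    """Gradient of coupling_energy with respect to each weight."""
    w = np.asarray(w, dtype=float).ravel()
    diff = w[:, None] - w[None, :]
    kernel = np.exp(-diff**2 / (2.0 * sigma**2))
    # dE/dw_i = sum_j (w_i - w_j) * K_ij / sigma^2
    return (diff * kernel).sum(axis=1) / sigma**2

# Toy example: two nearby weights and one distant weight.
w = np.array([0.0, 0.05, 1.0])
for _ in range(50):
    w = w - 1e-3 * coupling_grad(w)  # descent on the coupling term only
# The two nearby weights are pulled together; the distant one barely moves
# because the assumed attraction is short-ranged.
```

With a short-range kernel, weights merge into several separate clusters rather than collapsing to a single value, which is the behavior one would want from a term meant to encourage quantizability. The all-pairs computation here costs O(N^2) memory and is meant only for small toy vectors.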

*Princeton Center for the Physics of Biological Function

Publication: Quantization and the Bottom of the Loss Landscape (ICML workshop paper); Neural Network Quantization via Weight-Weight Coupling (planned paper)

Presenters

  • Daniel T Bernstein
    • Princeton University

Authors

  • Daniel T Bernstein
    • Princeton University
  • Luca Di Carlo
    • Princeton University
  • David J Schwab
    • The Graduate Center, City University of New York