Compression in Neural Networks via Weight Coupling

Daniel Bernstein; Luca Di Carlo; David Schwab

Oral-In-person

Abstract

One of the most popular methods for compressing machine learning models is weight quantization, which reduces the precision of individual network weights rather than the number of weights. We explore the efficacy of adding a pairwise, attractive coupling between weights to encourage weight clustering and quantizability. We find that adding such a term to the task loss rapidly drives a pre-trained network to a quantized configuration with minimal impact on generalization. Our implementation is computationally efficient and yields mixed-precision models while relying on a small number of intuitive hyperparameters. We show that the "quantizability" of models via weight couplings has interesting implications for the loss-landscape geometry of problems in machine learning.
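To make the idea of a pairwise, attractive coupling concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the functional form of the coupling, so a short-range Gaussian well is assumed here; the names `coupling_energy`, `coupling_grad`, and `relax`, and the parameters `strength`, `length`, `steps`, and `lr`, are all hypothetical. The task loss is omitted, so the sketch shows only how such a term pulls nearby weights into clusters.

```python
import math

def coupling_energy(weights, strength=1.0, length=0.1):
    """Assumed coupling (not from the abstract):
    E = -strength * sum_{i<j} exp(-(w_i - w_j)^2 / (2 * length^2))."""
    total = 0.0
    for i in range(len(weights)):
        for j in range(i + 1, len(weights)):
            d = weights[i] - weights[j]
            total += math.exp(-d * d / (2.0 * length ** 2))
    return -strength * total

def coupling_grad(weights, strength=1.0, length=0.1):
    """Gradient of the assumed energy with respect to each weight:
    dE/dw_i = strength / length^2 * sum_{j != i} (w_i - w_j) * exp(...)."""
    grad = []
    for i, wi in enumerate(weights):
        g = 0.0
        for j, wj in enumerate(weights):
            if i != j:
                d = wi - wj
                g += d * math.exp(-d * d / (2.0 * length ** 2))
        grad.append(strength * g / length ** 2)
    return grad

def relax(weights, steps=200, lr=1e-3, **kwargs):
    """Plain gradient descent on the coupling term alone (no task loss)."""
    w = list(weights)
    for _ in range(steps):
        g = coupling_grad(w, **kwargs)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

if __name__ == "__main__":
    # Two loose groups of weights; each group should collapse toward one value.
    print(relax([0.0, 0.05, 1.0, 1.04]))
```

Under this assumed form, the attraction is negligible between weights separated by much more than `length`, so distant groups stay distinct while nearby weights merge. In an actual training loop the coupling gradient would be added to the task-loss gradient, with `strength` setting the trade-off between the two.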

March 18, 2026, 8:12 AM – March 18, 2026, 8:24 AM

Publication: Quantization and the Bottom of the Loss Landscape (ICML workshop paper); Neural Network Quantization via Weight-Weight Coupling (planned paper)

Presenters

Daniel Bernstein
- Princeton University

Authors

Daniel Bernstein
- Princeton University
Luca Di Carlo
David Schwab
- The Graduate Center, City University of New York