Entropy Advantage in Neural Networks Generalizability

Entao Yang; Xiaotian Zhang; Yue Shang; Ge Zhang

ORAL

Abstract

While neural networks have been widely used to assist physics research, leveraging physical principles to study neural networks has received much less attention. Inspired by statistical physics, we introduce the concept of entropy into neural networks by reconceptualizing them as hypothetical one-dimensional physical systems in which each parameter is the coordinate of a "particle". We investigate the correlation between the entropy of these physical systems and the generalizability of the corresponding neural networks on four distinct machine learning tasks. Our results suggest an Entropy Advantage: high-entropy states consistently outperform the states reached via classical training optimizers such as stochastic gradient descent. We also perform separation-of-variable studies to evaluate the factors controlling the entropy advantage.

*This work used Bridges-2 GPU resources at the Pittsburgh Supercomputing Center through allocation CIS230096 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program.

March 18, 2025, 10:00 AM – 10:12 AM

Presenters

Entao Yang
- Air Liquide USA
- Air Liquide

Authors

Entao Yang
- Air Liquide USA
- Air Liquide
Xiaotian Zhang
- City University of Hong Kong
Yue Shang
- University of Pennsylvania
Ge Zhang
- City University of Hong Kong