Entropy Advantage in Neural Networks Generalizability
ORAL
Abstract
While neural networks have been widely used to assist physics research, leveraging physical principles to study neural networks has received much less attention. Inspired by statistical physics, we introduce the concept of entropy into neural networks by reconceptualizing them as hypothetical one-dimensional physical systems in which each parameter is the coordinate of a "particle". We investigate the correlation between the entropy of these physical systems and the generalizability of the neural networks on four distinct machine learning tasks. Our results suggest an Entropy Advantage, in which high-entropy states consistently outperform the states reached via classical training optimizers such as stochastic gradient descent. Separation-of-variables studies have also been performed to evaluate the controlling factors of the entropy advantage.
*This work used Bridges-2 GPU at Pittsburgh Supercomputing Center through allocation CIS230096 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program.
Presenters
Entao Yang
- Air Liquide USA
- Air Liquide