Interpreting Transformers through the Lens of Physics and Dynamical Systems

Tirtho Roy; Ushashi Bhattacharjee; Adrito Roy

Poster-In-person

Abstract

Transformer-based large language models (LLMs) have demonstrated remarkable linguistic and reasoning capabilities, yet their internal dynamics remain poorly understood. This work reviews recent efforts to interpret LLMs through the lens of physics and nonlinear dynamical systems. We explore how hidden states evolve as trajectories in a high-dimensional phase space, governed by layer-wise transformations analogous to physical time evolution. Attention mechanisms are discussed as nonlocal field interactions, while loss landscapes are treated as energy surfaces on which training resembles free-energy minimization. Concepts from thermodynamics, such as entropy, temperature, and criticality, offer insights into emergent behaviors, scaling laws, and phase transitions in model performance. By integrating principles from statistical mechanics, information geometry, and dynamical systems theory, this review aims to establish a unified framework for understanding language models as physical systems. Such perspectives may guide future research in interpretability, robustness, and efficient training.
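
As a rough illustration of two analogies named in the abstract, the sketch below is a toy, not the authors' method. It assumes (a) that a residual stack can be read as forward-Euler steps x_{l+1} = x_l + h f_l(x_l) of a dynamical system, and (b) that softmax attention weights can be read as Boltzmann weights at temperature T, whose entropy grows with T. The layer f_l is a hypothetical random tanh map, and the names `residual_trajectory`, `attention_entropy`, `step`, and `temperature` are introduced here only for illustration.

```python
import numpy as np

def softmax(scores, temperature=1.0):
    """Boltzmann-style weights exp(score / T) / Z, computed row-wise."""
    z = (scores - scores.max(axis=-1, keepdims=True)) / temperature
    w = np.exp(z)
    return w / w.sum(axis=-1, keepdims=True)

def attention_entropy(weights):
    """Shannon entropy (in nats) of each row of attention weights."""
    return -(weights * np.log(weights + 1e-12)).sum(axis=-1)

def residual_trajectory(x0, n_layers, step, rng):
    """Iterate x_{l+1} = x_l + step * f_l(x_l), a forward-Euler view of a residual stack.

    f_l is a toy random tanh layer (an assumption, not a trained Transformer block).
    Returns an array of shape (n_layers + 1, d) holding the hidden-state trajectory.
    """
    d = x0.shape[-1]
    states = [x0]
    for _ in range(n_layers):
        W = rng.normal(scale=1.0 / np.sqrt(d), size=(d, d))
        states.append(states[-1] + step * np.tanh(states[-1] @ W))
    return np.stack(states)

rng = np.random.default_rng(0)

# Hidden state of dimension 8 followed through 6 toy layers.
traj = residual_trajectory(rng.normal(size=8), n_layers=6, step=0.1, rng=rng)

# Toy attention scores for 4 queries over 4 keys, at low and high temperature.
scores = rng.normal(size=(4, 4))
cold = attention_entropy(softmax(scores, temperature=0.1))
hot = attention_entropy(softmax(scores, temperature=10.0))
```

In this toy setting, `traj` plays the role of a phase-space trajectory indexed by layer, and comparing `cold` with `hot` shows attention moving from sharply peaked to nearly uniform as the assumed temperature increases.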

March 17, 2026, 2:00 PM – March 17, 2026, 2:00 PM

· 157

Presenters

Tirtho Roy
- Iowa State University

Authors

Tirtho Roy
- Iowa State University
Ushashi Bhattacharjee
- Iowa State University
Adrito Roy
- Barishal Zilla School