Statistical Mechanics of Function Vectors

Ravin Raj; Gautam Reddy

Oral-In-person

Abstract

Transformer-based large language models exhibit in-context learning (ICL): the ability to learn from examples presented in the context without any weight updates to the model. Mechanistic analysis of large meta-trained models has revealed two dominant mechanisms for ICL: multi-layer attention-based circuits that implement specific computations, and function vectors, which encode an emergent 'internal state' that, when applied to a new input, produces an appropriate response. We develop a framework for function-vector-based ICL that exploits a connection between dense associative memories and maximum entropy (MaxEnt) methods from statistical mechanics. For a specific choice of the reference distribution, the model reduces to a standard linear attention model, and the framework offers a precise interpretation of a function vector as an internal state that specifies a prior over functions. Ongoing work seeks to generalize this framework through a sequence of information projections on the manifold of distributions.
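As a rough illustration of the linear attention limit mentioned in the abstract, the sketch below shows how unnormalized linear attention over context pairs (x_i, y_i) can be read as first compressing the context into a single internal state W = sum_i y_i x_i^T and then applying that state to a query. This is a minimal sketch under stated assumptions, not the authors' model: the function names, the absence of normalization, and the identification of W with a function-vector-like state are all illustrative choices made here.

```python
# Illustrative sketch only: hypothetical helper names, not the authors' code.

def dot(u, v):
    """Inner product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def context_state(xs, ys):
    """Compress context pairs into W = sum_i y_i x_i^T (outer-product sum).

    Assumption: this matrix plays the role of the 'internal state'
    that summarizes the examples seen in context.
    """
    d_out, d_in = len(ys[0]), len(xs[0])
    return [[sum(y[r] * x[c] for x, y in zip(xs, ys)) for c in range(d_in)]
            for r in range(d_out)]

def apply_state(W, x_query):
    """Apply the stored state to a new input: prediction = W x_query."""
    return [dot(row, x_query) for row in W]

def linear_attention(xs, ys, x_query):
    """Unnormalized linear attention: sum_i (x_i . x_query) y_i."""
    scores = [dot(x, x_query) for x in xs]
    d_out = len(ys[0])
    return [sum(s * y[r] for s, y in zip(scores, ys)) for r in range(d_out)]

# Context generated by a hidden linear map A, with orthonormal inputs.
A = [[2, 0, 1],
     [0, -1, 3]]
xs = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ys = [apply_state(A, x) for x in xs]

x_query = [1, 2, 3]
W = context_state(xs, ys)
print(apply_state(W, x_query))           # prediction via the stored state
print(linear_attention(xs, ys, x_query)) # same prediction via attention
```

Both routes give the same prediction because sum_i (x_i . x) y_i = (sum_i y_i x_i^T) x. With orthonormal context inputs, as in this toy setup, the stored state recovers the hidden map A exactly; for general inputs it would only approximate it.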

March 18, 2026, 12:00 PM – March 18, 2026, 12:12 PM

Presenters

Ravin Raj
- Princeton University

Authors

Ravin Raj
- Princeton University
Gautam Reddy
- Princeton University