Universality of LLM mechanisms across scale and diversity

Lindsay Smith; Gautam Reddy; David Schwab

Oral-In-person

Abstract

Large Language Models (LLMs) have become ubiquitous, both in everyday use and as a subject of scientific experiments and theory on their capabilities and learning mechanisms. However, it remains to be shown whether the mechanisms of LLMs are universal across data diversity, model scale, and initialization. We train our own 1.7B-parameter LLM on state-of-the-art data used in performant models, and we also train variants that differ only in the random seed for model initialization, in data ordering, or in the subset of data used. Using information-theoretic metrics, we show that these model variants have different output distributions, but that the distributions become more similar over the course of training. Our results quantify when task-specific abilities emerge in training, examine the universality of these abilities, and contribute to reproducibility in AI research.
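
The abstract does not say which information-theoretic metrics are used. As an illustration only, the sketch below assumes one common choice, the Jensen-Shannon divergence between the next-token distributions of two model variants evaluated on the same contexts. The function names and the toy logits are hypothetical and are not taken from the work itself.

```python
import numpy as np

def softmax(logits):
    """Convert logits of shape (..., vocab) into probabilities."""
    z = logits - logits.max(axis=-1, keepdims=True)  # stabilize the exponent
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence in bits along the last axis.

    p and q hold probability distributions over the same vocabulary.
    The result is 0 for identical distributions and 1 bit for
    distributions with disjoint support.
    """
    m = 0.5 * (p + q)  # mixture of the two distributions

    def kl(a, b):
        # eps guards log(0); terms with a == 0 contribute 0.
        return np.sum(a * (np.log2(a + eps) - np.log2(b + eps)), axis=-1)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def mean_js(logits_a, logits_b):
    """Average per-position JS divergence between two models' outputs."""
    return float(np.mean(js_divergence(softmax(logits_a), softmax(logits_b))))

if __name__ == "__main__":
    # Synthetic stand-in for two variants' logits on the same 8 contexts
    # (assumption: a vocabulary of 50 tokens; real models are far larger).
    rng = np.random.default_rng(0)
    logits_a = rng.normal(size=(8, 50))
    logits_b = logits_a + rng.normal(scale=0.5, size=(8, 50))
    print(f"mean JS divergence (bits): {mean_js(logits_a, logits_b):.4f}")
```

Evaluating such a quantity at matched training checkpoints of two variants would give one curve of output similarity over training time; whether the work uses this metric or a different one is not stated in the abstract.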

March 18, 2026, 12:24 PM – March 18, 2026, 12:36 PM

Presenters

Lindsay Smith
- Princeton University

Authors

Lindsay Smith
- Princeton University
Gautam Reddy
- Princeton University
David Schwab
- The Graduate Center, City University of New York