Scaling Laws and Emergent Behaviors in Foundation Models

ORAL · Invited

Abstract

Large-scale unsupervised pre-trained models, a.k.a. "foundation models", are taking the AI field by storm, achieving state-of-the-art performance and impressive few-shot generalization abilities on a variety of tasks across multiple domains. Predicting performance and other metrics of interest (robustness, truthfulness, etc.) at scale, including potential emergent behaviors, is crucial for (1) choosing learning methods that are likely to stand the test of time as more compute becomes available, and (2) ensuring safe behavior of AI systems by anticipating potential emergent behaviors ("phase transitions"). We investigate both an "open-box" approach, where access to the learning dynamics and internal metrics of a neural network is available (e.g., in the case of "grokking" behavior), and a "closed-box" approach, where predictions of future behavior must be made solely from the system's past behavior, without access to internal measurements.
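
The sketch below is a minimal illustration of the "closed-box" setting, not the speaker's method: it fits a saturating power law, loss(C) = a * C^(-b) + c, to evaluation losses observed at small compute budgets and extrapolates to a larger budget. All numbers, units, and parameter names are hypothetical placeholders.

```python
# Illustrative "closed-box" extrapolation sketch (assumed form, synthetic data):
# fit a saturating power law to losses observed at small compute budgets and
# predict the loss at a much larger, unobserved budget.
import numpy as np
from scipy.optimize import curve_fit

def power_law(compute, a, b, c):
    """Saturating power-law scaling curve: loss as a function of compute."""
    return a * compute ** (-b) + c

# Hypothetical observations: compute in units of 1e18 FLOPs and the
# corresponding evaluation losses (placeholder numbers).
compute = np.array([1.0, 3.0, 10.0, 30.0, 100.0])
loss = np.array([3.10, 2.72, 2.40, 2.18, 2.01])

# Fit the three parameters to the small-scale observations only.
params, _ = curve_fit(power_law, compute, loss, p0=[2.0, 0.3, 1.0])
a, b, c = params

# Extrapolate to a budget 100x larger than the largest observed point.
target = 10_000.0  # i.e., 1e22 FLOPs in these units
print(f"Fitted a={a:.2f}, b={b:.2f}, c={c:.2f}")
print(f"Predicted loss at {target:.0f}e18 FLOPs: {power_law(target, a, b, c):.2f}")
```

Such smooth curve fits cannot, by themselves, anticipate sharp phase transitions like grokking, which is what motivates complementing them with the "open-box" analysis of learning dynamics and internal metrics.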

Presenters

  • Irina Rish

    MILA

Authors

  • Irina Rish

    MILA