Statistical physics of regression with quadratic models

ORAL

Abstract

A central model in machine learning, for both theory and practice, is the linear model, in which the predictor is a linear function of the learnable parameters. Such models arise naturally in a common limit of infinitely wide deep neural networks and have aided the understanding of learning dynamics and generalization. However, linear models have limitations and do not capture the richness of "feature learning" that arises in deep neural networks. We consider quadratic models, predictors that allow a quadratic dependence on parameters, as a class of models for studying the effects of feature learning. We theoretically investigate how generalization scales with sample size and how learning dynamics unfold in these models, via replica methods from statistical physics and a dynamical mean-field theory.
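To make the distinction concrete, here is a minimal sketch (not the authors' model) contrasting a predictor that is linear in its parameters with one that adds a quadratic-in-parameters term; the feature maps `phi` and `psi` and the coupling `eps` are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p = 5, 8  # illustrative input and parameter dimensions

# Hypothetical fixed feature maps defining the models (not from the paper).
phi = rng.standard_normal((p, d))     # features entering linearly in theta
psi = rng.standard_normal((p, p, d))  # features entering quadratically in theta

def linear_model(theta, x):
    # f(x; theta) = theta^T (phi x): linear in the parameters theta.
    return theta @ (phi @ x)

def quadratic_model(theta, x, eps=0.5):
    # f(x; theta) = theta^T (phi x) + (eps/2) theta^T (psi x) theta:
    # a minimal departure from linearity via a term quadratic in theta.
    return theta @ (phi @ x) + 0.5 * eps * theta @ ((psi @ x) @ theta)

theta = rng.standard_normal(p)
x = rng.standard_normal(d)

# Linearity check: doubling theta exactly doubles the linear model's output,
# but not the quadratic model's, since its second term scales as theta^2.
assert np.isclose(linear_model(2 * theta, x), 2 * linear_model(theta, x))
assert not np.isclose(quadratic_model(2 * theta, x), 2 * quadratic_model(theta, x))
```

The quadratic term lets the effective features seen by the predictor change as the parameters move during training, which is one simple way a model can exhibit feature learning absent from the purely linear case.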

* This work was supported by NSF Award DMS-2134157 and a Google PhD research fellowship.

Presenters

  • Blake Bordelon

    Harvard University

Authors

  • Blake Bordelon

    Harvard University

  • Cengiz Pehlevan

    Harvard University

  • Yasaman Bahri

    Google LLC