Statistical physics of regression with quadratic models
ORAL
Abstract
A central model in machine learning for theory and practice is the linear model, where the predictor is a linear function of the learnable parameters. Such models arise naturally in a common limit of infinitely-wide deep neural networks and have aided in the understanding of the dynamics of learning and generalization. However, linear models also have limitations and do not capture the richness of "feature learning" that arises in deep neural networks. We consider quadratic models – predictors which allow a quadratic dependence on parameters – as a class of models for studying the effects of feature learning. We theoretically investigate the generalization scaling with sample size and learning dynamics in these models via replica methods from statistical physics and a dynamical mean field theory.
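The abstract contrasts linear models with quadratic models, i.e. predictors with a quadratic dependence on the learnable parameters. A minimal sketch of what such a predictor might look like is below; the specific features `A`, `B` and the scaling `eps` on the quadratic term are illustrative assumptions, not the authors' exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p = 5, 8  # input dimension and parameter count (illustrative sizes)

# Hypothetical fixed random features: a(x) enters linearly in theta,
# B(x) enters quadratically in theta.
A = rng.standard_normal((p, d))       # a(x) = A @ x, shape (p,)
B = rng.standard_normal((p, p, d))    # B(x) = B @ x, shape (p, p)

def quadratic_model(theta, x, eps=0.1):
    """Predictor with quadratic parameter dependence:
    f(x; theta) = a(x)^T theta + (eps/2) * theta^T B(x) theta.
    eps sets the strength of the quadratic term; eps = 0
    recovers an ordinary linear model."""
    a_x = A @ x
    B_x = B @ x
    return a_x @ theta + 0.5 * eps * theta @ B_x @ theta

theta = rng.standard_normal(p)
x = rng.standard_normal(d)
print(quadratic_model(theta, x))
```

Setting `eps = 0` removes the quadratic term, so the model reduces to the linear case the abstract uses as its baseline.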
* This work was supported by NSF Award DMS-2134157 and a Google PhD research fellowship.
Presenters
-
Blake Bordelon
Harvard University
Authors
-
Blake Bordelon
Harvard University
-
Cengiz Pehlevan
Harvard University
-
Yasaman Bahri
Google LLC