Mutual Information Estimators for Optimal Joint Embeddings

Paarth Gulati; Sean Ridout; Ilya Nemenman

Oral-In-person

Abstract

Contrastive objectives used in modern joint embedding architectures can be reinterpreted as variational estimators of mutual information, which reveals an underlying information-bottleneck framework for representation learning. Leveraging this framework, we show that under a broad range of conditions, embeddings of high-dimensional datasets exhibit jointly Gaussian statistics, and that in this regime existing methods are optimal and accurately capture the mutual information. For datasets with structured non-Gaussian latent variables, we design optimal architectures that use substantially fewer samples than existing methods. Our framework naturally generalizes to multi-view datasets with more than two modalities, offering a path toward faster, data-efficient model discovery in physical and biological systems from limited and noisy experimental data.
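
As a rough illustration of the first claim, the sketch below evaluates the standard InfoNCE contrastive objective on a pair of correlated scalar Gaussians and compares it with the closed-form Gaussian mutual information, I = -(1/2) log(1 - rho^2). This is not the authors' estimator or architecture: the function names, the choice of rho and sample size, and the use of the analytically optimal critic (the log density ratio, standing in for a learned embedding similarity) are all assumptions made for illustration.

```python
import math
import random


def gaussian_mi(rho):
    """Exact mutual information (nats) of a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log(1.0 - rho * rho)


def optimal_critic(x, y, rho):
    """Log density ratio log p(x, y) / (p(x) p(y)) for unit-variance Gaussians.

    Hypothetical stand-in for a trained critic; in practice this score would be
    a similarity between learned embeddings of x and y.
    """
    s = 1.0 - rho * rho
    quad = 2.0 * rho * x * y - rho * rho * (x * x + y * y)
    return -0.5 * math.log(s) + quad / (2.0 * s)


def infonce(xs, ys, critic):
    """InfoNCE estimate: mean_i [ f(x_i, y_i) - log( (1/N) sum_j exp f(x_i, y_j) ) ]."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        scores = [critic(xs[i], ys[j]) for j in range(n)]
        m = max(scores)  # subtract the max for a numerically stable log-mean-exp
        log_mean_exp = m + math.log(sum(math.exp(s - m) for s in scores) / n)
        total += scores[i] - log_mean_exp
    return total / n


def sample_pairs(n, rho, seed=0):
    """Draw n pairs (x, y) of unit-variance Gaussians with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(y)
    return xs, ys


rho, n = 0.8, 512  # assumed values, chosen only for illustration
xs, ys = sample_pairs(n, rho)
estimate = infonce(xs, ys, lambda x, y: optimal_critic(x, y, rho))
print(f"InfoNCE estimate: {estimate:.3f} nats")
print(f"Exact Gaussian MI: {gaussian_mi(rho):.3f} nats")
print(f"InfoNCE ceiling log(N): {math.log(n):.3f} nats")
```

Because each positive pair is also counted in its own denominator, the estimate can never exceed log(N). That ceiling is one reason contrastive estimators of this kind need many samples when the true mutual information is large.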

March 18, 2026, 5:18 PM – March 18, 2026, 5:30 PM

Presenters

Paarth Gulati
- Emory University

Authors

Paarth Gulati
- Emory University
Sean Ridout
- Emory University
Ilya Nemenman
- Emory University