Stochastic Resetting of Reinforcement Learning Agents

Jello Zhou; David Schwab; Wave Ngampruetikorn

Oral-In-person

Abstract

Stochastic resetting, the strategy of randomly restarting a search process independent of its current state, has been shown to improve mean first-passage times across a wide range of physical and biological search processes. Here, we apply stochastic resetting to reinforcement learning (RL) agents, aiming to understand its effects on exploration efficiency and learning dynamics. To separate search from learning, we choose a finite simulation geometry in which stochastic resetting does not improve first-passage times. Even in this setting, a finite resetting rate can significantly accelerate learning by reducing the number of training steps required to reach optimal policies. Moreover, we demonstrate that even small nonzero resetting rates enhance learning efficiency compared to no resetting, and that this behaviour is robust across GridWorld Q-learning and TD(λ) for a wide range of λ. These findings suggest that stochastic resetting may be a broadly applicable tool for accelerating learning processes in both artificial and biological systems, and they point to potential avenues for further numerical and analytical investigation.
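To make the setup concrete, the following is a minimal sketch of tabular Q-learning in a GridWorld with state-independent resetting. It is not the authors' implementation: the grid size, reward scheme, epsilon-greedy exploration, the per-step reset probability `reset_prob`, and the choice to reset the agent to the start cell without ending the episode are all assumptions made for illustration, as is the function name `q_learning_with_resetting`.

```python
import random


def q_learning_with_resetting(size=5, reset_prob=0.05, episodes=200,
                              alpha=0.5, gamma=0.9, epsilon=0.1,
                              max_steps=10_000, seed=0):
    """Tabular Q-learning on a size x size grid (all settings are assumptions).

    After every non-terminal step the agent is sent back to the start cell
    with probability reset_prob, regardless of where it currently is.
    Returns the Q-table and the total number of environment steps taken.
    """
    rng = random.Random(seed)
    start, goal = (0, 0), (size - 1, size - 1)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    q = {((x, y), a): 0.0
         for x in range(size) for y in range(size) for a in range(4)}
    total_steps = 0

    for _ in range(episodes):
        state = start
        for _ in range(max_steps):
            total_steps += 1

            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(4)
            else:
                best = max(q[(state, a)] for a in range(4))
                action = rng.choice(
                    [a for a in range(4) if q[(state, a)] == best])

            # Move; walls are reflecting (the agent stays put at the edge).
            dx, dy = moves[action]
            nxt = (min(max(state[0] + dx, 0), size - 1),
                   min(max(state[1] + dy, 0), size - 1))

            # Reward 1 on reaching the goal, 0 otherwise (assumed scheme).
            done = nxt == goal
            reward = 1.0 if done else 0.0
            bootstrap = 0.0 if done else max(q[(nxt, a)] for a in range(4))
            target = reward + gamma * bootstrap
            q[(state, action)] += alpha * (target - q[(state, action)])

            if done:
                break

            # Stochastic reset: independent of the current state.
            state = start if rng.random() < reset_prob else nxt

    return q, total_steps
```

Comparing `total_steps` for `reset_prob=0.0` against a small nonzero value, averaged over many seeds, is one way to probe the kind of effect the abstract describes. In this sketch the jump back to the start is not itself used in a Q-update; that is a design choice of the example, not something the abstract specifies.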

March 18, 2026, 2:12 PM – March 18, 2026, 2:24 PM

Presenters

Jello Zhou
- Stanford University

Authors

Jello Zhou
- Stanford University
David Schwab
- The Graduate Center, City University of New York
Wave Ngampruetikorn