Stochastic Resetting of Reinforcement Learning Agents

ORAL

Abstract

Stochastic resetting -- the strategy of randomly restarting a search process, independent of its current state -- has been shown to reduce mean first-passage times across a wide range of physical and biological search processes. Here, we apply stochastic resetting to reinforcement learning (RL) agents to understand its effects on exploration efficiency and learning dynamics. To separate search from learning, we choose a finite simulation geometry in which stochastic resetting does not improve first-passage times. Even in this setting, a finite resetting rate can significantly accelerate learning by reducing the number of training steps required to reach optimal policies. Moreover, we demonstrate that even small nonzero resetting rates enhance learning efficiency relative to no resetting, and that this behaviour is robust across GridWorld Q-learning and TD(λ) for a wide range of λ. These findings suggest that stochastic resetting may be a broadly applicable tool for accelerating learning processes in both artificial and biological systems, and they point to avenues for further numerical and analytical investigation.

*DJS acknowledges support from a Simons Fellowship in the MMLS and a Sloan Fellowship in Physics. 
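
As a rough illustration of the setup the abstract describes, the following is a minimal sketch of tabular Q-learning on a small GridWorld in which, at every step, the agent is returned to the start cell with a fixed probability regardless of where it is. It is not the authors' implementation. The function name `q_learning_with_resetting`, the grid size, the reward scheme, the ε-greedy policy, the choice to count a reset as one training step, and all hyperparameter values are assumptions made for illustration only.

```python
import random


def q_learning_with_resetting(size=5, reset_rate=0.05, episodes=200,
                              alpha=0.5, gamma=0.95, epsilon=0.1,
                              max_steps=10_000, seed=0):
    """Tabular Q-learning on a size x size grid with stochastic resetting.

    All defaults are hypothetical values chosen for illustration.
    Returns the Q-table and the number of steps taken in each episode.
    """
    rng = random.Random(seed)
    start, goal = (0, 0), (size - 1, size - 1)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    q = {((r, c), a): 0.0
         for r in range(size) for c in range(size) for a in range(4)}
    steps_per_episode = []

    for _ in range(episodes):
        state, steps = start, 0
        while state != goal and steps < max_steps:
            steps += 1
            # Stochastic reset: happens with probability reset_rate,
            # independent of the current state (assumed to cost one step).
            if rng.random() < reset_rate:
                state = start
                continue
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(4)
            else:
                best = max(q[(state, a)] for a in range(4))
                action = rng.choice(
                    [a for a in range(4) if q[(state, a)] == best])
            # Moves that would leave the grid keep the agent at the wall.
            dr, dc = moves[action]
            nxt = (min(max(state[0] + dr, 0), size - 1),
                   min(max(state[1] + dc, 0), size - 1))
            # Reward 1 on reaching the goal, 0 otherwise (an assumption).
            if nxt == goal:
                target = 1.0
            else:
                target = gamma * max(q[(nxt, a)] for a in range(4))
            # Standard Q-learning update toward the bootstrapped target.
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
        steps_per_episode.append(steps)

    return q, steps_per_episode
```

Calling the function with `reset_rate=0.0` and with a small positive `reset_rate`, then comparing the per-episode step counts averaged over many seeds, gives one simple way to probe the kind of comparison the abstract reports; this sketch makes no claim about what that comparison would show for these particular settings.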

Presenters

  • Jello Zhou

    • Stanford University

Authors

  • Jello Zhou

    • Stanford University
  • David J Schwab

    • The Graduate Center, City University of New York
  • Wave Ngampruetikorn

    • University of Sydney