An Extension of Entropy-Regularized Reinforcement Learning to Reducible Dynamics

POSTER

Abstract

Entropy-regularized reinforcement learning (ER-RL) effectively promotes both robustness and exploration in environments with stochastic dynamics by incorporating an entropic penalty term into the objective function. Recent work has shown how the objective in ER-RL can be optimized using approaches and results from non-equilibrium statistical mechanics (NESM).

However, this mapping relies on the Perron-Frobenius theorem, which holds for Markov decision processes (MDPs) with irreducible transition matrices. For MDPs with reducible dynamics, the theorem cannot be invoked, so it is unclear whether approaches derived from NESM can be used to find optimal policies in ER-RL. To address this gap, we propose a theoretical framework that extends ER-RL to environments with reducible dynamics. Our approach modifies the twisted generator of the MDP while preserving key insights from the Perron-Frobenius theorem. Through both theoretical analysis and extensive empirical evaluation, we show that our method yields analytical results for optimal policies in ER-RL for systems with reducible dynamics. This work opens new avenues for addressing several important issues in ER-RL, ranging from insights into the role of discounting to strategies for efficient exploration.
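As an illustration of the irreducibility condition discussed above, the following is a minimal Python sketch and not the authors' method. It assumes a small row-stochastic transition matrix over states only (no actions, rewards, or twisting), and the function names `is_irreducible` and `dominant_left_eigenvector` are hypothetical. It shows the property the Perron-Frobenius theorem guarantees for irreducible matrices, a strictly positive dominant eigenvector, and how that property can fail once an absorbing state makes the dynamics reducible.

```python
import numpy as np


def is_irreducible(P):
    """Return True if every state can reach every other state under P."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    # Boolean reachability pattern, including zero-step (self) reachability.
    reach = (P > 0) | np.eye(n, dtype=bool)
    # Each squaring doubles the path length covered; paths of length n - 1 suffice.
    for _ in range(max(1, int(np.ceil(np.log2(n))))):
        step = reach.astype(int)
        reach = (step @ step) > 0
    return bool(reach.all())


def dominant_left_eigenvector(P):
    """Left eigenvector of P for the eigenvalue with largest real part,
    normalized to sum to 1. Assumes that eigenvalue is non-degenerate."""
    P = np.asarray(P, dtype=float)
    eigvals, eigvecs = np.linalg.eig(P.T)
    k = np.argmax(eigvals.real)
    v = eigvecs[:, k].real
    return v / v.sum()


# Illustrative two-state chains (assumed for this sketch, not from the abstract).
P_irreducible = np.array([[0.5, 0.5],
                          [0.5, 0.5]])
P_reducible = np.array([[0.5, 0.5],
                        [0.0, 1.0]])  # state 1 is absorbing

v_irr = dominant_left_eigenvector(P_irreducible)
v_red = dominant_left_eigenvector(P_reducible)
```

Here `v_irr` has strictly positive entries, while `v_red` places all of its weight on the absorbing state, so one component vanishes. Quantities built from the logarithm or inverse of such components are then undefined, which is one way to see why the reducible case needs separate treatment.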

*The authors acknowledge funding support from NSF through Award No. PHY-2425180.

Presenters

  • Sho Inaba

    • Department of Physics, University of Massachusetts Boston
    • University of Massachusetts Boston

Authors

  • Sho Inaba

    • Department of Physics, University of Massachusetts Boston
    • University of Massachusetts Boston
  • Jacob Adamczyk

    • University of Massachusetts Boston
  • Rahul V Kulkarni

    • Department of Physics, University of Massachusetts Boston
    • University of Massachusetts Boston