An Extension of Entropy-Regularized Reinforcement Learning to Reducible Dynamics

Sho Inaba; Jacob Adamczyk; Rahul Kulkarni


Poster-In-person

Abstract

Entropy-regularized reinforcement learning (ER-RL) effectively promotes both robustness and exploration in environments with stochastic dynamics by incorporating an entropic penalty term into the objective function. Recent work has shown how the objective in ER-RL can be optimized using approaches and results from non-equilibrium statistical mechanics (NESM).

However, this mapping relies on the Perron-Frobenius theorem, which applies to Markov decision processes (MDPs) with irreducible transition matrices. For MDPs with reducible dynamics, the Perron-Frobenius theorem cannot be invoked, and it is therefore unclear whether approaches derived from NESM can be used to find optimal policies in ER-RL. To address this challenge, we propose a theoretical framework for extending ER-RL to environments with reducible dynamics. Our approach modifies the twisted generator of the MDP while preserving key insights from the Perron-Frobenius theorem. Through both theoretical analysis and extensive empirical evaluation, we show that our method yields analytical results for optimal policies in ER-RL for systems with reducible dynamics. This work opens new avenues for addressing several important issues in ER-RL, ranging from insights into the role of discounting to strategies for efficient exploration.
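To make the irreducible/reducible distinction concrete, the following is a minimal sketch in Python and not the authors' method. It uses a plain row-stochastic transition matrix rather than the twisted generator discussed in the abstract. The function names `is_irreducible` and `dominant_left_eigvec` and the two toy matrices are assumptions introduced only for illustration. The check relies on the standard fact that a nonnegative n×n matrix A is irreducible exactly when (I + A)^(n-1) is entrywise positive.

```python
import numpy as np


def is_irreducible(P):
    """Return True if the nonnegative square matrix P is irreducible.

    Illustrative helper (not from the source): tests whether every state
    can reach every other state through the nonzero pattern of P.
    """
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    # Boolean reachability in at most one step (self-loops included).
    reach = np.eye(n, dtype=bool) | (P > 0)
    # Each squaring doubles the path length covered; n - 1 rounds is
    # more than enough to cover all paths of length up to n - 1.
    for _ in range(n - 1):
        reach = (reach.astype(int) @ reach.astype(int)) > 0
    return bool(reach.all())


def dominant_left_eigvec(P):
    """Left eigenvector of P for the eigenvalue of largest real part,
    normalized so that its entries sum to 1."""
    eigvals, eigvecs = np.linalg.eig(np.asarray(P, dtype=float).T)
    v = np.real(eigvecs[:, np.argmax(eigvals.real)])
    return v / v.sum()


# Toy examples (assumed, for illustration only).
P_irreducible = np.array([[0.5, 0.5],
                          [0.5, 0.5]])
P_reducible = np.array([[1.0, 0.0],   # state 0 is absorbing
                        [0.5, 0.5]])

irreducible_flag = is_irreducible(P_irreducible)
reducible_flag = is_irreducible(P_reducible)
v_irreducible = dominant_left_eigvec(P_irreducible)
v_reducible = dominant_left_eigvec(P_reducible)
```

In the irreducible case the dominant eigenvector is strictly positive, as the Perron-Frobenius theorem guarantees. In the reducible toy example the eigenvector has a zero entry on the transient state, which is one simple way to see that the strict-positivity guarantee no longer holds.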

March 17, 2026, 2:00 PM – March 17, 2026, 2:00 PM

· 320

Presenters

Sho Inaba
- University of Massachusetts Boston

Authors

Sho Inaba
- University of Massachusetts Boston
Jacob Adamczyk
Rahul Kulkarni
- University of Massachusetts Boston