EVE: EigenVector-Based Exploration
ORAL
Abstract
The question of how to efficiently explore an environment has long been a central focus in artificial intelligence. In reinforcement learning specifically, a natural approach to the exploration problem is to incentivize broad coverage of the possible states within the environment—i.e., we wish to encourage entropy-maximizing behavior. The standard way to accomplish this can be computationally expensive, as calculating the entropy normally requires running repeated simulations until the system reaches a steady state, or equilibrium. However, recent work has derived a closed-form expression for this steady-state distribution, suggesting a more efficient approach. Building on this result, we develop a new algorithm that maximizes entropy. We demonstrate our method's general applicability to the efficient exploration problem by finding "optimal exploration policies" in different domains.
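The contrast the abstract draws, between simulating until equilibrium and evaluating the steady state directly, can be illustrated with a small sketch. It is not the paper's algorithm or its closed-form expression, neither of which appears in this abstract. It assumes a fixed policy induces a finite, irreducible, aperiodic Markov chain with a row-stochastic transition matrix, and uses the textbook fact that the stationary distribution is then the left eigenvector for eigenvalue 1. The matrix P and all function names are hypothetical.

```python
import numpy as np


def stationary_distribution(P):
    """Stationary distribution as the left eigenvector of P for eigenvalue 1.

    Assumes P is row-stochastic and the chain is irreducible and aperiodic,
    so that this eigenvector is unique up to scaling.
    """
    eigvals, eigvecs = np.linalg.eig(P.T)  # left eigenvectors of P
    idx = np.argmin(np.abs(eigvals - 1.0))  # eigenvalue closest to 1
    v = np.real(eigvecs[:, idx])
    return v / v.sum()  # normalize to a probability vector


def simulate_to_steady_state(P, max_steps=10_000, tol=1e-12):
    """Baseline: push a uniform start distribution through P until it stops changing."""
    mu = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(max_steps):
        nxt = mu @ P
        if np.abs(nxt - mu).sum() < tol:
            return nxt
        mu = nxt
    return mu


def entropy(p):
    """Shannon entropy in nats, skipping zero-probability states."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))


# Hypothetical 3-state chain induced by some fixed policy (illustrative only).
P = np.array([
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
])

pi_eig = stationary_distribution(P)
pi_sim = simulate_to_steady_state(P)
print("eigenvector:", pi_eig, "entropy:", entropy(pi_eig))
print("iteration:  ", pi_sim, "entropy:", entropy(pi_sim))
```

Both routes agree on this toy chain. An entropy-maximizing exploration method would additionally search over policies, and therefore over transition matrices, to raise this entropy; that search is not shown here.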
Publication: EVE: EigenVector-Based Exploration (paper)
Presenters
- Adam Kamoski, University of Massachusetts Boston