EVE: EigenVector-Based Exploration

ORAL

Abstract

The question of how to efficiently explore an environment has long been a central focus in artificial intelligence. In reinforcement learning specifically, a natural approach to the exploration problem is to incentivize broad coverage of the possible states within the environment; that is, we wish to encourage entropy-maximizing behavior. The standard way to accomplish this can be computationally expensive, as calculating the entropy normally requires running repeated simulations until the system reaches a steady state, or equilibrium. However, recent work has derived a closed-form expression for this steady-state distribution, suggesting a more efficient approach. Building on this result, we develop a new algorithm that maximizes the entropy of the steady-state distribution. We demonstrate our method's general applicability to the efficient exploration problem by finding "optimal exploration policies" in different domains.
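To make the quantity being maximized concrete, the following is a minimal sketch, not the authors' algorithm and not the closed-form expression the abstract refers to. It assumes a fixed policy that induces a small, ergodic Markov chain with a row-stochastic transition matrix, and it obtains the steady-state distribution as the left eigenvector with eigenvalue 1 instead of simulating the chain to equilibrium. The function names and the two example matrices are hypothetical, chosen only for illustration.

```python
import numpy as np


def stationary_distribution(P):
    """Steady-state distribution of a row-stochastic matrix P.

    Assumes the chain is ergodic, so the eigenvalue 1 is unique.
    The result is the left eigenvector of P for that eigenvalue,
    normalized to sum to 1.
    """
    P = np.asarray(P, dtype=float)
    # Left eigenvectors of P are right eigenvectors of P transposed.
    eigvals, eigvecs = np.linalg.eig(P.T)
    idx = np.argmin(np.abs(eigvals - 1.0))
    v = np.real(eigvecs[:, idx])
    # Dividing by the sum fixes both the scale and the overall sign.
    return v / v.sum()


def shannon_entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 1e-12]  # skip zero entries, since 0 * log(0) is taken as 0
    return float(-np.sum(nz * np.log(nz)))


# Hypothetical chain induced by a policy that keeps returning to state 0.
P_biased = np.array([
    [0.9, 0.1, 0.0],
    [0.9, 0.0, 0.1],
    [0.9, 0.0, 0.1],
])

# Hypothetical chain induced by a policy that moves uniformly to other states.
P_uniform = np.array([
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
])

for name, P in [("biased", P_biased), ("uniform", P_uniform)]:
    pi = stationary_distribution(P)
    print(name, pi, shannon_entropy(pi))
```

In this toy setting, the second chain visits all three states equally often, so its steady-state entropy reaches the maximum of log 3, while the first chain concentrates on a single state and scores lower. An exploration objective of the kind described above would prefer the policy behind the second chain.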

Publication: EVE: EigenVector-Based Exploration (paper)

Presenters

  • Adam Kamoski

    • University of Massachusetts Boston

Authors

  • Adam Kamoski

    • University of Massachusetts Boston
  • Rahul V Kulkarni

    • Department of Physics, University of Massachusetts Boston
    • University of Massachusetts Boston
  • Sho Inaba

    • Department of Physics, University of Massachusetts Boston
    • University of Massachusetts Boston
  • Jacob Adamczyk

    • University of Massachusetts Boston