Reinforcement Learning and the Cost of Observation
ORAL
Abstract
Reinforcement learning has recently found practical application in both physics and chemistry laboratories. In these environments, measurements are often expensive, whether in the resources they consume or the time they take, so they are typically made intermittently or only when exploring new conditions. A reinforcement learning agent used to optimise a laboratory experiment would ideally behave in the same way: it would need to act without perfect information, balancing the cost of observation against the need for new data.
In this work, we show how an agent can build, from its past measurements, an internal hypothesis of its environment that it can then act on. Its measurements come from a set of sensors, each with an associated cost, and the informativeness of each sensor varies with the current state, so the agent must determine which information is most valuable to sample. We also demonstrate how the agent's behaviour changes when certain measurements are restricted by cost.
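The abstract does not specify how observation costs enter the learning problem; a common formulation is to subtract the cost of each requested sensor reading from the reward, so a standard RL algorithm can learn when a measurement is worth its price. Below is a minimal, hypothetical Python sketch assuming a Gym-style step interface; the names CostlyMeasurementWrapper and sensor_costs, and the (control, measure_mask) action split, are illustrative and not taken from the paper.

```python
import numpy as np

class CostlyMeasurementWrapper:
    """Hypothetical sketch: wrap an environment so each sensor reading
    carries a cost. The agent's action is a pair (control, measure_mask),
    where measure_mask[i] = 1 requests a reading from sensor i at price
    sensor_costs[i]. Unmeasured entries keep their stale values, so the
    agent acts on an imperfect internal picture when it declines to pay.
    Assumes the wrapped env exposes reset() -> obs and
    step(control) -> (obs, reward, done)."""

    def __init__(self, env, sensor_costs):
        self.env = env
        self.sensor_costs = np.asarray(sensor_costs, dtype=float)
        self.last_obs = None

    def reset(self):
        self.last_obs = np.asarray(self.env.reset(), dtype=float)
        return self.last_obs.copy()

    def step(self, control, measure_mask):
        obs, reward, done = self.env.step(control)
        mask = np.asarray(measure_mask, dtype=bool)
        # Only the sensors the agent paid for are refreshed.
        self.last_obs = np.where(mask, obs, self.last_obs)
        # Fold the total measurement cost into the reward signal.
        reward -= float(self.sensor_costs[mask].sum())
        return self.last_obs.copy(), reward, done
```

Folding the cost into the reward is one simple design choice; it keeps the agent's objective scalar, so the trade-off between fresh data and cheap action emerges from ordinary value learning rather than a hand-tuned measurement schedule.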
Presenters
-
Rory Coles
University of Ontario Institute of Technology
Authors
-
Rory Coles
University of Ontario Institute of Technology
-
Colin Bellinger
National Research Council of Canada
-
Isaac Tamblyn
National Research Council of Canada