A Model-Based Reinforcement Learning Approach for Beta Control
POSTER
Abstract
The goal of Reinforcement Learning (RL) is to learn, from data, a controller that takes optimal actions in an environment. RL has recently produced striking results in a variety of applications, including the game of Go, video games, and robotic control. In this work, we explore whether these advances can be applied in the plasma physics setting to derive a high-performing controller for Beta target tracking.
Because RL algorithms are notoriously data hungry and physical simulations of plasma are computationally expensive, we take a model-based approach. We first learn a model of plasma dynamics from historical shot data from DIII-D, and we then use this model to train an RL agent. The dynamics model predicts how several key plasma quantities change in response to an action. The RL agent treats the learned model as if it were the true environment, and we train the agent to track a specified target for normalized beta (betan). We evaluate the resulting model-derived controllers using the predictive TRANSP code and compare them against a tuned PID controller.
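The model-based workflow described above can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the method used in this work: the "plant" is a hypothetical scalar linear system rather than a plasma, the dynamics model is fit by least squares rather than by a learned neural model, and a grid search over a single proportional gain stands in for RL policy training. The only point is the structure: fit a model from logged data, optimize a controller against that model alone, then evaluate it on a separate environment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" plant, standing in for the real device: x' = a*x + b*u.
A_TRUE, B_TRUE = 0.9, 0.5

def true_step(x, u):
    return A_TRUE * x + B_TRUE * u

# 1) Logged transitions (a stand-in for historical shot data).
xs = rng.uniform(-1.0, 3.0, size=500)
us = rng.uniform(-1.0, 1.0, size=500)
xs_next = true_step(xs, us) + rng.normal(0.0, 0.01, size=500)

# 2) Fit a dynamics model x' ~ a_hat*x + b_hat*u by least squares.
features = np.column_stack([xs, us])
(a_hat, b_hat), *_ = np.linalg.lstsq(features, xs_next, rcond=None)

def model_step(x, u):
    return a_hat * x + b_hat * u

def tracking_cost(gain, step_fn, target=2.0, horizon=50):
    """Sum of squared tracking errors for u = gain * (target - x), clipped."""
    x, cost = 0.0, 0.0
    for _ in range(horizon):
        u = float(np.clip(gain * (target - x), -1.0, 1.0))
        x = step_fn(x, u)
        cost += (target - x) ** 2
    return cost

# 3) Optimize the controller using only the learned model
#    (grid search here; an RL algorithm would take this role).
gains = np.linspace(0.0, 3.0, 61)
best_gain = min(gains, key=lambda g: tracking_cost(g, model_step))

# 4) Evaluate the model-derived controller on the separate "true" plant.
model_cost = tracking_cost(best_gain, model_step)
true_cost = tracking_cost(best_gain, true_step)
print(f"a_hat={a_hat:.3f} b_hat={b_hat:.3f} gain={best_gain:.2f}")
print(f"cost in model={model_cost:.3f}, cost on true plant={true_cost:.3f}")
```

The gap between `model_cost` and `true_cost` is the quantity of interest in any such setup: a controller optimized against a learned model is only as good as the model is accurate in the region the controller visits, which is why evaluation on an independent simulator and comparison against a baseline controller matter.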
*This work was supported by DE-FC02-04ER54698 (DIII-D Cooperative Agreement) and DE-SC0021414. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE1745016. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Presenters
- Ian Char, Carnegie Mellon University