Comparing deep reinforcement-learning techniques: applications to quantum memory
ORAL
Abstract
In our recent work [1], we showed that reinforcement learning with artificial neural networks (ANNs) can be a powerful tool for discovering quantum-error-correction strategies that are fully adapted to the hardware of a quantum memory. We employed a reinforcement-learning technique called natural policy gradient, in which the policy of the ANN is updated and improved according to the second-order gradient of the return (the cumulative sum of the rewards) in the parameter space of the ANN.
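As a rough sketch of the update described above, the natural-policy-gradient step can be written in standard textbook notation. The symbols below (parameters \theta, learning rate \eta, policy \pi_\theta, rewards r_t, Fisher information matrix F) are assumptions introduced here for illustration and are not taken from [1]:

```latex
\begin{align}
  R &= \sum_{t=0}^{T} r_t
    && \text{(return: cumulative sum of rewards)} \\
  J(\theta) &= \mathbb{E}_{\pi_\theta}\!\left[ R \right]
    && \text{(expected return)} \\
  \nabla_\theta J(\theta) &= \mathbb{E}_{\pi_\theta}\!\left[ R \sum_{t=0}^{T} \nabla_\theta \log \pi_\theta(a_t \mid s_t) \right]
    && \text{(policy gradient)} \\
  F(\theta) &= \mathbb{E}_{\pi_\theta}\!\left[ \nabla_\theta \log \pi_\theta(a \mid s)\, \nabla_\theta \log \pi_\theta(a \mid s)^{\top} \right]
    && \text{(Fisher information matrix)} \\
  \theta &\leftarrow \theta + \eta\, F(\theta)^{-1}\, \nabla_\theta J(\theta)
    && \text{(natural-gradient update)}
\end{align}
```

Here s_t and a_t denote the state and action at time step t. Preconditioning the plain gradient with F^{-1} is what distinguishes the natural policy gradient from the ordinary one.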
The principal downsides of policy gradient are sample inefficiency and slow convergence, which can be critical when the quantum system is simulated classically and its Hilbert space grows exponentially. Here we conduct an in-depth study of the performance of more advanced reinforcement-learning techniques [2] applied to a noisy quantum memory. We find that training can be sped up by orders of magnitude through a careful choice of the technique and the corresponding hyperparameters, both of which are motivated by and related to the underlying physics.
[1] T. Fösel, P. Tighineanu, T. Weiß, F. Marquardt, Phys. Rev. X 8, 031084 (2018).
[2] P. Dhariwal, C. Hesse, O. Klimov, et al., OpenAI Baselines, https://github.com/openai/baselines.
–
Presenters
-
Petru Tighineanu
Max Planck Institute for the Science of Light
Authors
-
Petru Tighineanu
Max Planck Institute for the Science of Light
-
Thomas Foesel
Max Planck Institute for the Science of Light
-
Talitha Weiss
IQOQI, University of Innsbruck, Institute for Quantum Optics and Quantum Information
-
Florian Marquardt
Max Planck Institute for the Science of Light, Staudtstrasse 2, 91058 Erlangen, Germany