Reinforcement Learning Meets Quantum Control - Artificially Intelligent Maxwell's Demon

ORAL

Abstract

Feedback control of quantum systems is of fundamental importance for practical applications in various contexts, ranging from quantum computation to quantum error correction and quantum metrology. However, deriving optimal feedback control strategies is highly challenging, particularly when it involves the optimal control of open quantum systems, the stochastic nature of quantum measurement, and the inclusion of policies that maximize a long-term time- and trajectory-averaged goal.

In our work, we employ a reinforcement learning approach to automate and capture the role of a quantum Maxwell’s demon: a neural network literally takes on the demon's role, discovering optimal feedback-control strategies in qubit-based quantum systems that optimize the tradeoff between measurement-powered cooling and measurement efficiency. We explore different operational regimes based on the ordering of the thermalization, measurement, and unitary feedback timescales, and find distinct, highly non-intuitive, yet interpretable strategies.

*PAE gratefully acknowledges funding by the Berlin Mathematics Center MATH+ (AA2-18). JE has been supported by the DFG (FOR 2724, CRC 183), the BMBF (QSolid), and the ERC (DebuQC). RC, BB and ANJ acknowledge the support of Chapman University, the U.S. Army Research Office under grant W911NF-22-1-0258, and the U.S. Department of Energy (DOE), Office of Science, Basic Energy Sciences (BES), under Award No. DE-SC0017890. JE acknowledges funding by the DFG (FOR 2724, for which this is an inter-node collaboration reaching an important milestone, and CRC 183), the FQXI, the Quantum Flagship (Millenion, for which this is again the result of an inter-node collaboration), the BMBF (DAQC), and the ERC (DebuQC). FN gratefully acknowledges funding by the BMBF (Berlin Institute for the Foundations of Learning and Data, BIFOLD), the European Research Commission (ERC CoG 772230) and the Berlin Mathematics Center MATH+ (AA1-6, AA2-8).

Publication: arXiv:2408.15328v1 [quant-ph]

Presenters

  • Robert Czupryniak

    • University of Rochester

Authors

  • Robert Czupryniak

    • University of Rochester
  • Paolo A Erdman

    • Freie Universität Berlin
  • Bibek Bhandari

    • Chapman University
  • Andrew N Jordan

    • Chapman University
  • Jens Eisert

    • Freie Universität Berlin
  • Frank Noe

    • Microsoft Corporation
  • Giacomo Guarnieri

    • Freie Universität Berlin
    • University of Pavia