54th Annual Meeting of the APS Division of Atomic, Molecular and Optical Physics
Volume 68, Number 7
Monday–Friday, June 5–9, 2023;
Spokane, Washington
Session S10: Hybrid Quantum Systems
10:30 AM–11:54 AM,
Thursday, June 8, 2023
Room: 207
Chair: Raghavendra Srinivas, University of Oxford/Oxford Ionics
Abstract: S10.00006 : Improved scalable strategies for fast, high-fidelity, and long-distance entanglement distribution*
11:30 AM–11:42 AM
Presenter:
Stav Haldar
(Louisiana State University)
Authors:
Stav Haldar
(Louisiana State University)
Pratik J Barge
(Louisiana State University)
Sumeet Khatri
(Freie Universität Berlin)
Hwang Lee
(Louisiana State University)
Near-term implementations of entanglement distribution in quantum networks must overcome current hardware limitations such as link losses, non-ideal measurements, and quantum memories with short coherence times. In this work, we show that the optimization of figures of merit such as the waiting time and the fidelity of the end-to-end entanglement can be formulated as a Markov decision process (MDP). An optimal protocol, or policy, for entanglement distribution can then be determined using reinforcement learning (RL). In particular, we simulate a near-term quantum network for entanglement distribution along a linear chain of nodes, both for homogeneous and inhomogeneous chains, and optimize the figures of merit using RL. We quantify the trade-off between minimizing the waiting time and maximizing the end-to-end link fidelity. We use the model-free algorithm Q-learning, in which the agent improves its policy while simultaneously learning how the network itself behaves. Our key finding is that, in certain parameter regimes, our RL-based optimization scheme yields policies that outperform previously known ones, such as the so-called "swap-as-soon-as-possible" policy. Our improved policies are characterized by dynamic, state-dependent cut-offs and by collaboration between the nodes. Notably, we introduce in this work novel quantifiers of this collaboration. They measure how much "global" knowledge of the network each node has, specifically, how much two distant nodes know of each other's states, which is an important consideration for the practical implementation of our RL-based policies. Ultimately, RL-based methods are limited by the size of the networks that can be simulated efficiently. The other main contribution of our work is to overcome this limitation.
We introduce a new method for nesting RL-based policies for small repeater chains in order to obtain improved policies for large, long-distance repeater chains, thus paving the way for a scalable, practical implementation of long-distance entanglement distribution.
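The MDP/Q-learning approach described above can be illustrated on a toy example. The sketch below is not the authors' model: the two-link chain, the generation and decoherence probabilities, the reward shaping, and all parameter values are illustrative assumptions. It shows only the generic mechanism, tabular Q-learning over a repeater-chain state space, where the agent learns when to perform an entanglement swap.

```python
import random

# Toy 3-node repeater chain: two elementary links, each absent (0) or present (1).
# All parameters below are hypothetical, chosen only for illustration.
P_GEN = 0.5    # probability an entanglement-generation attempt succeeds
P_DECAY = 0.1  # probability a stored link is lost per step (memory decoherence)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1  # learning rate, discount, exploration rate

ACTIONS = ("wait", "swap")

def step(state, action, rng):
    """One time step of the toy chain. Returns (next_state, reward, done)."""
    l1, l2 = state
    if action == "swap" and l1 and l2:
        return (0, 0), 1.0, True  # end-to-end entanglement delivered
    # "swap" without both links is a no-op and behaves like "wait":
    # attempt generation on empty links; stored links may decohere.
    l1 = 1 if (l1 and rng.random() > P_DECAY) or (not l1 and rng.random() < P_GEN) else 0
    l2 = 1 if (l2 and rng.random() > P_DECAY) or (not l2 and rng.random() < P_GEN) else 0
    return (l1, l2), -0.01, False  # small per-step cost encodes the waiting time

def train(episodes=5000, seed=0):
    rng = random.Random(seed)
    Q = {(s1, s2): {a: 0.0 for a in ACTIONS} for s1 in (0, 1) for s2 in (0, 1)}
    for _ in range(episodes):
        state, done = (0, 0), False
        for _ in range(100):  # cap episode length
            # epsilon-greedy action selection
            a = rng.choice(ACTIONS) if rng.random() < EPS else max(Q[state], key=Q[state].get)
            nxt, r, done = step(state, a, rng)
            # standard Q-learning (model-free) update
            target = r + (0.0 if done else GAMMA * max(Q[nxt].values()))
            Q[state][a] += ALPHA * (target - Q[state][a])
            state = nxt
            if done:
                break
    return Q

Q = train()
policy = {s: max(Q[s], key=Q[s].get) for s in Q}
```

In this toy setting, with no cost attached to swapping, the learned policy reduces to swapping as soon as both links exist; the interesting regimes discussed in the abstract arise when fidelity decay and inhomogeneous links make state-dependent cut-offs preferable.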
*This work was supported by the Army Research Office Multidisciplinary University Research Initiative (ARO MURI) through the grant number W911NF2120214.