Overview of the training loop and the structure of the simulated environment

Trend-Based SAC Beam Control Method with Zero-Shot in Superconducting Linear Accelerator

X. Chen, X. Qi, C. Su, Y. He, Z. Wang, K. Sun, C. Jin, W. Chen, S. Liu, X. Zhao, D. Jia, M. Yi
Chinese Academy of Sciences
arXiv
Abstract: The superconducting linear accelerator is a highly flexible facility for modern scientific discoveries, necessitating weekly reconfiguration and tuning. Accordingly, minimizing setup time proves essential in affording users ample experimental time. We propose a trend-based soft actor-critic (TBSAC) beam control method with strong robustness, allowing the agents to be trained in a simulated environment and applied to the real accelerator directly with zero-shot....

May 23, 2023 · 244 words · RL4AA Collaboration
Overview of the orbit correction method.

Orbit Correction Based on Improved Reinforcement Learning Algorithm

X. Chen, Y. Jia, X. Qi, Z. Wang, Y. He
Chinese Academy of Sciences
Physical Review Accelerators and Beams
Abstract: Recently, reinforcement learning (RL) algorithms have been applied to a wide range of control problems in accelerator commissioning. In order to achieve efficient and fast control, these algorithms need to be highly efficient, so as to minimize the online training time. In this paper, we incorporated the beam position monitor trend into the observation space of the twin delayed deep deterministic policy gradient (TD3) algorithm and trained two agents with different structures, one based on physical prior knowledge and the other using the original TD3 network architecture....
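
As a rough illustration of what adding a beam position monitor (BPM) trend to an observation could look like, the sketch below appends the change in each BPM reading since the previous step to the current readings. The abstract does not define the trend, so treating it as a simple step-to-step difference is an assumption, and the function name `trend_observation` and its parameters are hypothetical rather than taken from the paper.

```python
from typing import List, Optional, Sequence


def trend_observation(
    current_bpm: Sequence[float],
    previous_bpm: Optional[Sequence[float]] = None,
) -> List[float]:
    """Concatenate BPM readings with their step-to-step change.

    ASSUMPTION: the "trend" is modelled as current minus previous reading;
    the paper may define it differently.
    """
    if previous_bpm is None:
        # No history on the first step, so the trend is taken to be zero.
        trend = [0.0] * len(current_bpm)
    else:
        if len(previous_bpm) != len(current_bpm):
            raise ValueError("BPM vectors must have the same length")
        trend = [c - p for c, p in zip(current_bpm, previous_bpm)]
    # Observation layout: [readings..., trends...]
    return list(current_bpm) + trend


# Hypothetical readings from two BPMs over two consecutive steps.
first = trend_observation([1.0, -2.0])
second = trend_observation([0.5, -1.0], previous_bpm=[1.0, -2.0])
```

With this layout the observation is twice as long as the raw BPM vector, and an agent such as TD3 would see both where the orbit is and which way it moved after the last action.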

April 13, 2023 · 327 words · RL4AA Collaboration