Schematic view of the GMPS control environment.

Real-time artificial intelligence for accelerator control: A study at the Fermilab Booster

J. St. John1, C. Herwig1, D. Kafkes1, J. Mitrevski1, W. A. Pellico1, G. N. Perdue1, A. Quintero-Parra1, B. A. Schupbach1, K. Seiya1, N. Tran1, M. Schram2, J. M. Duarte3, Y. Huang4, R. Keller5 1Fermi National Accelerator Laboratory, 2Thomas Jefferson National Accelerator Laboratory, 3University of California San Diego, 4Pacific Northwest National Laboratory, 5Columbia University Physical Review Accelerators and Beams Abstract We describe a method for precisely regulating the gradient magnet power supply (GMPS) at the Fermilab Booster accelerator complex using a neural network trained via reinforcement learning. We demonstrate preliminary results by training a surrogate machine-learning model on real accelerator data to emulate the GMPS, and using this surrogate model in turn to train the neural network for its regulation task. We additionally show how the neural networks to be deployed for control purposes may be compiled to execute on field-programmable gate arrays (FPGAs), and present the first machine-learning-based control algorithm implemented on an FPGA for controls at the Fermilab accelerator complex. As there are no surprise latencies on an FPGA, this capability is important for operational stability in complicated environments such as an accelerator facility. ...

October 18, 2021 · 194 words · RL4AA Collaboration
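
A minimal sketch of the two-stage scheme the abstract above describes: first fit a surrogate model of the GMPS response on logged data, then train a small policy network against the frozen surrogate. All dimensions, network sizes, and the toy data below are illustrative assumptions, not the paper's actual code.

```python
# Stage 1: surrogate model of the GMPS; Stage 2: policy trained on it.
import torch
import torch.nn as nn

state_dim, action_dim = 4, 1  # assumed dimensions for illustration

# (1) Surrogate: predicts the next state from (state, action).
surrogate = nn.Sequential(nn.Linear(state_dim + action_dim, 64),
                          nn.ReLU(), nn.Linear(64, state_dim))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
# Stand-in for real logged accelerator data:
states = torch.randn(1024, state_dim)
actions = torch.randn(1024, action_dim)
next_states = states + 0.1 * actions  # toy dynamics placeholder
for _ in range(200):
    pred = surrogate(torch.cat([states, actions], dim=1))
    loss = nn.functional.mse_loss(pred, next_states)
    opt.zero_grad(); loss.backward(); opt.step()

# (2) Policy trained against the frozen surrogate: minimize the
# regulation error of the first state component (a proxy reward).
for p in surrogate.parameters():
    p.requires_grad_(False)
policy = nn.Sequential(nn.Linear(state_dim, 32), nn.Tanh(),
                       nn.Linear(32, action_dim), nn.Tanh())
popt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for _ in range(200):
    s = torch.randn(256, state_dim)
    s_next = surrogate(torch.cat([s, policy(s)], dim=1))
    reg_error = s_next[:, 0].pow(2).mean()  # drive setpoint error to zero
    popt.zero_grad(); reg_error.backward(); popt.step()
```

Training against the surrogate avoids exploratory actions on the live power supply; the small policy network is the piece that would later be compiled to the FPGA.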
Online training of the NAF agent for AWAKE electron-line trajectory steering in the horizontal plane.

Test of Machine Learning at the CERN LINAC4

V. Kain1, N. Bruchon1, S. Hirlander1, N. Madysa1, I. Vojskovic1, P. Skowronski1, G. Valentino2 1CERN, 2University of Malta 61st ICFA ABDW on High-Intensity and High-Brightness Hadron Beams Abstract The CERN H− linear accelerator, LINAC4, served as a test bed for advanced algorithms during the CERN Long Shutdown 2 in the years 2019/20. One of the main goals was to show that reinforcement learning with all its benefits can be used as a replacement for numerical optimization and as a complement to classical control in the accelerator control context. Many of the algorithms used were prepared beforehand at the electron line of the AWAKE facility to make the best use of the limited time available at LINAC4. An overview of the algorithms and concepts tested at LINAC4 and AWAKE will be given and the results discussed. ...

October 4, 2021 · 132 words · RL4AA Collaboration
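
The caption for this entry mentions a NAF agent (normalized advantage functions), whose defining trick is a critic that is quadratic in the action, so the greedy action is available in closed form; this is part of what makes it practical for online training on a machine. A minimal sketch of such a critic follows; dimensions, layer sizes, and the tanh squashing are illustrative assumptions.

```python
# Sketch of the quadratic critic at the heart of a NAF agent.
import torch
import torch.nn as nn

class NAFCritic(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.base = nn.Sequential(nn.Linear(state_dim, hidden), nn.Tanh())
        self.V = nn.Linear(hidden, 1)             # state value
        self.mu = nn.Linear(hidden, action_dim)   # greedy action head
        self.L = nn.Linear(hidden, action_dim * action_dim)  # Cholesky factor
        self.action_dim = action_dim

    def forward(self, state, action):
        h = self.base(state)
        mu = torch.tanh(self.mu(h))
        L = torch.tril(self.L(h).view(-1, self.action_dim, self.action_dim))
        P = L @ L.transpose(1, 2)  # lower-triangular L => P is PSD
        d = (action - mu).unsqueeze(2)
        # Q(s,a) = V(s) - 1/2 (a - mu)^T P (a - mu): quadratic in a,
        # so argmax_a Q(s,a) = mu(s) in closed form.
        adv = -0.5 * (d.transpose(1, 2) @ P @ d).squeeze(2)
        return self.V(h) + adv

# Assumed sizes for a steering task: 10 BPM readings, 10 correctors.
critic = NAFCritic(state_dim=10, action_dim=10)
q = critic(torch.randn(8, 10), torch.randn(8, 10))
```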
Best PPO agent. Action is deterministic.

Renovation of the beam-based feedback systems in the LHC

L. Grech University of Malta PhD thesis Abstract The Large Hadron Collider (LHC) at the European Organization for Nuclear Research (CERN) is the largest synchrotron built to date, having a circumference of approximately 27 km. The LHC is able to accelerate two counter-rotating proton and/or heavy-ion beams up to 7 TeV per charge. These highly energetic beams are contained inside a vacuum chamber with an inner diameter of 80 mm by means of strong magnetic fields produced by superconducting magnets. A beam cleaning and machine protection system is in place to prevent high-energy halo particles from impacting and heating the superconducting magnets. ...

September 1, 2021 · 655 words · RL4AA Collaboration
Hardware solution for RL control.

Accelerated Deep Reinforcement Learning for Fast Feedback of Beam Dynamics at KARA

W. Wang1, M. Caselle1, T. Boltz1, E. Blomley1, M. Brosi1, T. Dritschler1, A. Ebersoldt1, A. Kopmann1, A. Santamaria Garcia1, P. Schreiber1, E. Bründermann1, M. Weber1, A.-S. Müller1, Y. Fang2 1Karlsruhe Institute of Technology KIT, 2Northwestern Polytechnical University IEEE Transactions on Nuclear Science Abstract Coherent synchrotron radiation (CSR) is generated when the electron bunch length is of the order of magnitude of the wavelength of the emitted radiation. The self-interaction of short electron bunches with their own electromagnetic fields changes the longitudinal beam dynamics significantly. Above a certain current threshold, the micro-bunching instability develops, characterized by the appearance of distinguishable substructures in the longitudinal phase space of the bunch. To stabilize the CSR emission, a real-time feedback control loop based on reinforcement learning (RL) is proposed. Informed by the available THz diagnostics, the feedback is designed to act on the radio frequency (RF) system of the storage ring to mitigate the micro-bunching dynamics. To satisfy low-latency requirements given by the longitudinal beam dynamics, the RL controller has been implemented on hardware (FPGA). In this article, a real-time feedback loop architecture and its performance are presented and compared with a software implementation using Keras-RL on CPU/GPU. The results obtained with the CSR simulation Inovesa demonstrate that the functionality of both platforms is equivalent. The training performance of the hardware implementation is similar to that of the software solution, while it outperforms the Keras-RL implementation by an order of magnitude. The presented RL hardware controller is considered an essential platform for the development of intelligent CSR control systems. ...

May 27, 2021 · 260 words · RL4AA Collaboration
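
A rough sketch of the kind of loop the abstract above describes: a deliberately tiny Q-network (small enough to map onto an FPGA) choosing discrete corrections to the RF system from THz-diagnostics features. The toy environment, all sizes, the action set, and the absence of an exploration schedule are simplifying assumptions for illustration.

```python
# Minimal one-step Q-learning loop with an FPGA-sized network.
import torch
import torch.nn as nn

N_ACTIONS = 7   # e.g. discrete RF amplitude/phase corrections (assumed)
STATE_DIM = 8   # e.g. THz-diagnostics features (assumed)

q_net = nn.Sequential(nn.Linear(STATE_DIM, 16), nn.ReLU(),
                      nn.Linear(16, N_ACTIONS))  # tiny: FPGA-friendly
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma = 0.99

def toy_env_step(state, action):
    """Placeholder dynamics; reward favours a stable CSR signal."""
    next_state = state + 0.01 * torch.randn(STATE_DIM)
    reward = -next_state.abs().mean()
    return next_state, reward

state = torch.zeros(STATE_DIM)
for step in range(1000):
    with torch.no_grad():
        action = int(q_net(state).argmax())  # greedy (no exploration shown)
    next_state, reward = toy_env_step(state, action)
    # One-step Q-learning target and TD-error loss.
    target = reward + gamma * q_net(next_state).max().detach()
    loss = (q_net(state)[action] - target) ** 2
    opt.zero_grad(); loss.backward(); opt.step()
    state = next_state
```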
RL environment for beam optimisation in the ARES EA.

First Steps Toward an Autonomous Accelerator, A Common Project Between DESY and KIT

A. Eichler1, F. Burkart1, J. Kaiser1, W. Kuropka1, O. Stein1, E. Bründermann2, A. Santamaria Garcia2, C. Xu2 1Deutsches Elektronen-Synchrotron DESY, 2Karlsruhe Institute of Technology KIT 12th International Particle Accelerator Conference Abstract Reinforcement learning algorithms have risen in popularity in the accelerator physics community in recent years, showing potential in beam control and in the optimization and automation of tasks in accelerator operation. The Helmholtz AI project "Machine Learning Toward Autonomous Accelerators" is a collaboration between DESY and KIT that works on investigating and developing reinforcement learning applications for the automatic start-up of electron linear accelerators. The work is carried out in parallel at two similar research accelerators: ARES at DESY and FLUTE at KIT, giving the unique opportunity of transfer learning between facilities. One of the first steps of this project is the establishment of a common interface between the simulations and the machine, in order to test and apply various optimization approaches interchangeably between the two accelerators. In this paper we present first results on the common interface and its application to beam focusing in ARES, as well as the idea of laser shaping with spatial light modulators at FLUTE. ...

May 24, 2021 · 185 words · RL4AA Collaboration
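
A sketch of what such a "common interface" between simulation and machine can look like: one gym-style reset/step contract with swappable backends, so the same optimizer runs against either. All class and method names below are illustrative assumptions, not the project's actual API.

```python
# One environment contract, two interchangeable backends.
from abc import ABC, abstractmethod
import numpy as np

class TransverseTuningEnv(ABC):
    """Gym-like contract shared by simulation and machine backends."""
    @abstractmethod
    def reset(self) -> np.ndarray: ...
    @abstractmethod
    def step(self, action: np.ndarray):
        """Returns (observation, reward, done, info)."""

class SimulationBackend(TransverseTuningEnv):
    def reset(self):
        self.quads = np.zeros(3)  # assumed: 3 quadrupole strengths
        return self._observe()
    def step(self, action):
        self.quads += action
        obs = self._observe()
        reward = -np.sum(obs[:2] ** 2)  # smaller beam size is better
        return obs, reward, False, {}
    def _observe(self):
        return np.abs(self.quads) + 0.1  # toy stand-in for beam dynamics

class MachineBackend(TransverseTuningEnv):
    def reset(self):
        raise NotImplementedError("would write setpoints via the control system")
    def step(self, action):
        raise NotImplementedError("would read screen images via the control system")

env = SimulationBackend()  # swap in MachineBackend() on the real machine
obs = env.reset()
obs, reward, done, info = env.step(np.array([0.1, -0.2, 0.0]))
```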
Reinforcement learning agent combined with the physics-based polynomial neural network.

Physics-Enhanced Reinforcement Learning for Optimal Control

A. Ivanov, I. Agapov, A. Eichler, S. Tomin Deutsches Elektronen-Synchrotron DESY 12th International Particle Accelerator Conference Abstract We propose an approach for incorporating accelerator physics models into reinforcement learning agents. The proposed approach is based on the Taylor mapping technique for the simulation of particle dynamics. The resulting computational graph is represented as a polynomial neural network and embedded into traditional reinforcement learning agents. The application of the model is demonstrated in a nonlinear simulation model of beam transmission. The comparison of the approach with traditional numerical optimization as well as with neural-network-based agents demonstrates better convergence of the proposed technique. ...

May 21, 2021 · 110 words · RL4AA Collaboration
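
The core idea above, a truncated Taylor map written as a polynomial "layer" so that physics-derived transfer maps sit inside a neural-network computational graph, can be sketched compactly. The weights and second-order truncation here are illustrative assumptions; a real map would be derived from or fitted to the beam optics.

```python
# A second-order Taylor map as a polynomial neural-network layer.
import torch
import torch.nn as nn

class TaylorMapLayer(nn.Module):
    """Truncated Taylor map: x' = W0 + W1 x + W2 (x (x) x)."""
    def __init__(self, dim):
        super().__init__()
        self.W0 = nn.Parameter(torch.zeros(dim))
        self.W1 = nn.Parameter(torch.eye(dim))               # linear transfer matrix
        self.W2 = nn.Parameter(torch.zeros(dim, dim * dim))  # second-order terms

    def forward(self, x):
        quad = (x.unsqueeze(2) * x.unsqueeze(1)).flatten(1)  # all x_i * x_j products
        return self.W0 + x @ self.W1.T + quad @ self.W2.T

# One layer per lattice element; composing layers composes maps.
lattice = nn.Sequential(TaylorMapLayer(4), TaylorMapLayer(4))
out = lattice(torch.randn(16, 4))  # batch of phase-space coordinates
```

Because the layer is differentiable, it can be trained or embedded in an RL agent like any other network module, while its weights retain a direct physics interpretation.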
Simple scheme of the FERMI FEL seed laser alignment setup.

Feasibility Investigation on Several Reinforcement Learning Techniques to Improve the Performance of the FERMI Free-Electron Laser

N. Bruchon University of Trieste PhD thesis Abstract The research carried out in particle accelerator facilities does not concern only particle and condensed matter physics, although these are the main topics covered in the field. Indeed, since a particle accelerator is composed of many different sub-systems, its proper functioning depends both on each of these parts and their interconnection. It follows that the study, implementation, and improvement of the various sub-systems are fundamental points of investigation too. In particular, an interesting aspect for the automation engineering community is the control of such systems that usually are complex, large, noise-affected, and non-linear. ...

March 18, 2021 · 322 words · RL4AA Collaboration
Plot of the reward received by the agent versus step number.

Policy gradient methods for free-electron laser and terahertz source optimization and stabilization at the FERMI free-electron laser at Elettra

F. H. O’Shea1, N. Bruchon2, G. Gaio1 1Elettra Sincrotrone Trieste, 2University of Trieste Physical Review Accelerators and Beams Abstract In this article we report on the application of a model-free reinforcement learning method to the optimization of accelerator systems. We adapt a policy gradient algorithm to accelerator control, simplifying it from the sophisticated algorithms that have recently been demonstrated to solve complex dynamic problems. After outlining a theoretical basis for the functioning of the algorithm, we explore the small hyperparameter space to develop intuition about said parameters using a simple number-guess environment. Finally, we demonstrate the algorithm optimizing both a free-electron laser and an accelerator-based terahertz source in situ. The algorithm is applied to different accelerator control systems and optimizes the desired signals in a few hundred steps without any domain knowledge using up to five control parameters. In addition, the algorithm shows modest tolerance to accelerator fault conditions without any special preparation for such conditions. ...

December 21, 2020 · 160 words · RL4AA Collaboration
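
The number-guess environment mentioned in the abstract makes a compact worked example for a REINFORCE-style policy gradient: a Gaussian policy guesses a hidden number and is rewarded for proximity. This is a minimal sketch with a running-average baseline for stability; the paper's actual algorithm and hyperparameters may differ.

```python
# REINFORCE with a Gaussian policy on a number-guess environment.
import numpy as np

rng = np.random.default_rng(0)
target = 3.0                 # hidden number the policy must find
mu, log_sigma = 0.0, 0.0     # parameters of the Gaussian policy
baseline, lr = 0.0, 0.05

for step in range(2000):
    sigma = max(np.exp(log_sigma), 1e-2)   # floor keeps exploration alive
    a = rng.normal(mu, sigma)              # sample a guess (the "action")
    reward = -abs(a - target)              # closer guesses score higher
    advantage = reward - baseline          # baseline reduces gradient variance
    baseline += 0.1 * (reward - baseline)  # running-average baseline
    # REINFORCE: ascend advantage * grad log N(a; mu, sigma).
    g_mu = (a - mu) / sigma**2
    g_log_sigma = (a - mu) ** 2 / sigma**2 - 1.0
    mu += lr * advantage * g_mu
    log_sigma += lr * advantage * g_log_sigma

print(f"learned mu = {mu:.2f} (target {target})")
```

With only two policy parameters (mu and log_sigma), the effect of each hyperparameter on convergence is easy to observe, which is the point of using such a toy environment.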
A schematic overview of the AE-DYNA approach used in this paper.

Model-free and Bayesian Ensembling Model-based Deep Reinforcement Learning for Particle Accelerator Control Demonstrated on the FERMI FEL

S. Hirlaender1, N. Bruchon2 1University of Salzburg, 2University of Trieste arXiv Abstract Reinforcement learning holds tremendous promise in accelerator controls. The primary goal of this paper is to show how this approach can be utilised on an operational level on accelerator physics problems. Despite the success of model-free reinforcement learning in several domains, sample efficiency is still a bottleneck, which might be addressed by model-based methods. We compare a well-suited purely model-based approach to model-free reinforcement learning, applied to intensity optimisation on the FERMI FEL system. We find that the model-based approach demonstrates higher representational power and sample efficiency, while the asymptotic performance of the model-free method is slightly superior. The model-based algorithm is implemented in DYNA style using an uncertainty-aware model, and the model-free algorithm is based on tailored deep Q-learning. In both cases, the algorithms were implemented in a way that provides increased robustness to the noise that is omnipresent in accelerator control problems. ...

December 17, 2020 · 158 words · RL4AA Collaboration
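
A minimal sketch of the DYNA-style, uncertainty-aware idea described above: fit an ensemble of dynamics models on a few real transitions, then generate "imagined" rollouts for the agent, treating ensemble disagreement as an uncertainty signal. The sizes and toy dynamics are illustrative assumptions.

```python
# Ensemble dynamics model for DYNA-style model-based RL.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, N_MODELS = 3, 1, 5

ensemble = [nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 32), nn.ReLU(),
                          nn.Linear(32, STATE_DIM)) for _ in range(N_MODELS)]
opts = [torch.optim.Adam(m.parameters(), lr=1e-3) for m in ensemble]

# A handful of real transitions (stand-in for measured FEL data).
s = torch.randn(64, STATE_DIM)
a = torch.randn(64, ACTION_DIM)
s_next = s + 0.1 * a  # toy dynamics placeholder

for m, opt in zip(ensemble, opts):  # fit each ensemble member
    for _ in range(200):
        loss = nn.functional.mse_loss(m(torch.cat([s, a], 1)), s_next)
        opt.zero_grad(); loss.backward(); opt.step()

def imagine(state, action):
    """One model-based step with an uncertainty estimate."""
    preds = torch.stack([m(torch.cat([state, action], 1)) for m in ensemble])
    mean, disagreement = preds.mean(0), preds.std(0).mean()
    return mean, disagreement  # high disagreement => don't trust the rollout

next_mean, unc = imagine(torch.randn(1, STATE_DIM), torch.randn(1, ACTION_DIM))
```

Imagined transitions multiply the effective sample count, which is where the sample-efficiency gain of the model-based approach comes from; the disagreement signal guards against exploiting model error.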
Policy network maps states to actions.

Autonomous Control of a Particle Accelerator using Deep Reinforcement Learning

X. Pang1, S. Thulasidasan2, L. Rybarcyk2 1Apple, 2Los Alamos National Laboratory Machine Learning for Engineering Modeling, Simulation, and Design Workshop at Neural Information Processing Systems 2020 Abstract We describe an approach to learning optimal control policies for a large, linear particle accelerator using deep reinforcement learning coupled with a high-fidelity physics engine. The framework consists of an AI controller that uses deep neural networks for state and action-space representation and learns optimal policies using reward signals that are provided by the physics simulator. For this work, we only focus on controlling a small section of the entire accelerator. Nevertheless, initial results indicate that we can achieve better-than-human level performance in terms of particle beam current and distribution. The ultimate goal of this line of work is to substantially reduce the tuning time for such facilities by orders of magnitude, and achieve near-autonomous control. ...

December 12, 2020 · 150 words · RL4AA Collaboration
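
A skeleton of the simulator-in-the-loop setup this abstract describes: the physics engine acts as the environment and supplies the reward signal (e.g., beam current). The simulator stub and the placeholder policy below are illustrative assumptions; the paper's deep RL agent would plug in where the placeholder sits.

```python
# Physics simulator as the RL environment: the training skeleton.
import numpy as np

class PhysicsSimulatorStub:
    """Stand-in for a high-fidelity beam-dynamics code."""
    def __init__(self):
        self.settings = np.zeros(6)  # assumed: 6 tunable beamline elements
    def apply(self, action):
        self.settings += action
        # Toy reward: beam current degrades as settings drift from optimum.
        beam_current = 1.0 - 0.1 * np.sum(self.settings ** 2)
        return self.settings.copy(), beam_current

def placeholder_policy(obs, scale=0.05, rng=np.random.default_rng(0)):
    """Placeholder where the deep-RL policy network would sit."""
    return rng.normal(0.0, scale, size=obs.shape)

sim = PhysicsSimulatorStub()
obs = sim.settings.copy()
for episode_step in range(100):
    action = placeholder_policy(obs)
    obs, reward = sim.apply(action)  # reward comes from the simulator
```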