Overview of the steering method.

Ultra fast reinforcement learning demonstrated at CERN AWAKE

**Simon Hirlaender, Lukas Lamminger, Giovanni Zevi Della Porta, Verena Kain** Abstract Reinforcement learning (RL) is a promising direction in machine learning for the control and optimisation of particle accelerators, since it learns directly from experience without needing an a priori model. However, RL generally suffers from low sample efficiency, and thus training from scratch on the machine is often not an option. RL agents are usually trained or pre-tuned on simulators and then transferred to the real environment. In this work we propose a model-based RL approach based on Gaussian processes (GPs) to overcome the sample efficiency limitation. Our RL agent was able to learn to control the trajectory at the CERN AWAKE (Advanced Wakefield Experiment) facility, a problem of 10 degrees of freedom, within only a few interactions. To date, numerical optimisers are used to restore, increase, and stabilise the performance of accelerators. A major drawback is that they must explore the optimisation space each time they are applied. Our RL approach learns as quickly as numerical optimisers for one optimisation run, but can be used afterwards as a single-shot or few-shot controller. Furthermore, it can also handle safety constraints and time-varying systems and can be used for the online stabilisation of accelerator operation. This approach opens a new avenue for the application of RL in accelerator control and brings it into the realm of everyday applications. ...
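
As a rough illustration of the sample-efficiency idea above, the sketch below models the scalar steering objective directly with a Gaussian process and picks each next corrector setting from the GP posterior. This is a deliberate simplification (closer to Bayesian optimisation than the paper's full model-based RL agent), and the environment, kernel, and acquisition rule are all assumptions made for the toy example:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Hypothetical stand-in for the machine: set 10 corrector strengths,
# observe a scalar trajectory-RMS objective. Not the real AWAKE interface.
def measure_rms(action: np.ndarray) -> float:
    target = np.linspace(-0.5, 0.5, 10)           # unknown optimum (toy)
    return float(np.sum((action - target) ** 2))  # trajectory RMS surrogate

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(5, 10))              # a few warm-up settings
y = np.array([measure_rms(a) for a in X])

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(length_scale=0.5),
                              normalize_y=True)

for step in range(20):
    gp.fit(X, y)
    # Choose the next setting optimistically: minimise the GP lower bound.
    cand = rng.uniform(-1, 1, size=(256, 10))
    mu, sd = gp.predict(cand, return_std=True)
    a = cand[np.argmin(mu - 1.0 * sd)]
    X = np.vstack([X, a])
    y = np.append(y, measure_rms(a))

print("best RMS found:", y.min())
```

In the paper's setting the GP would instead capture the system response itself, which is what allows the trained agent to be reused afterwards as a single- or few-shot controller rather than re-exploring on every run.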

May 1, 2023 · 233 words · RL4AA Collaboration
Success rate of the various algorithms over initial beam intensity.

Automated Intensity Optimisation Using Reinforcement Learning at LEIR

N. Madysa, V. Kain, R. Alemany Fernandez, N. Biancacci, B. Goddard, F. M. Velotti CERN 13th Particle Accelerator Conference Abstract High intensities in the Low Energy Ion Ring (LEIR) at CERN are achieved by stacking several multi-turn injections from the pre-accelerator Linac3. Up to seven consecutive 200 μs long, 200 ms spaced pulses are injected from Linac3 into LEIR. Two inclined septa, one magnetic and one electrostatic, combined with a collapsing horizontal orbit bump, allow 6-D phase space painting via a linearly ramped mean momentum along the Linac3 pulse and injection at high dispersion. The already circulating beam is cooled and dragged longitudinally via electron cooling (e-cooling) into a stacking momentum to free space for the following injections. For optimal intensity accumulation, the electron energy and trajectory need to match the ion energy and orbit at the e-cooler section. ...
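
The RL part of this task can be pictured as a small environment whose actions adjust the e-cooler electron energy and trajectory and whose reward is the accumulated intensity. The sketch below uses the Gymnasium API with toy dynamics; the action set, observation, and intensity model are illustrative assumptions, not the paper's actual setup:

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class ECoolerTuningEnv(gym.Env):
    """Illustrative framing of the LEIR task: tune e-cooler electron energy
    and trajectory offsets so injected ions stack efficiently. The dynamics
    below are a toy stand-in, not the real machine."""

    def __init__(self):
        # action: [energy offset, horizontal offset, vertical offset], normalised
        self.action_space = spaces.Box(-1.0, 1.0, shape=(3,), dtype=np.float32)
        # observation: current settings plus the last measured intensity
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(4,),
                                            dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self._settings = self.np_random.uniform(-1, 1, size=3).astype(np.float32)
        return self._observe(), {}

    def step(self, action):
        self._settings = np.clip(self._settings + 0.1 * action, -1, 1)
        reward = self._intensity()   # accumulated intensity after the injections
        return self._observe(), reward, False, False, {}

    def _intensity(self) -> float:
        # toy model: intensity peaks when all offsets vanish
        return float(np.exp(-np.sum(self._settings ** 2)))

    def _observe(self):
        return np.concatenate([self._settings,
                               [self._intensity()]]).astype(np.float32)
```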

June 12, 2022 · 264 words · RL4AA Collaboration
RL environment for beam optimisation in the ARES EA.

First Steps Toward an Autonomous Accelerator, A Common Project Between DESY and KIT

A. Eichler¹, F. Burkart¹, J. Kaiser¹, W. Kuropka¹, O. Stein¹, E. Bründermann², A. Santamaria Garcia², C. Xu² ¹Deutsches Elektronen-Synchrotron DESY, ²Karlsruhe Institute of Technology KIT 12th International Particle Accelerator Conference Abstract Reinforcement learning algorithms have risen in popularity in the accelerator physics community in recent years, showing potential in beam control and in the optimization and automation of tasks in accelerator operation. The Helmholtz AI project “Machine Learning Toward Autonomous Accelerators” is a collaboration between DESY and KIT that works on investigating and developing reinforcement learning applications for the automatic start-up of electron linear accelerators. The work is carried out in parallel at two similar research accelerators: ARES at DESY and FLUTE at KIT, giving the unique opportunity of transfer learning between facilities. One of the first steps of this project is the establishment of a common interface between the simulations and the machine, in order to test and apply various optimization approaches interchangeably between the two accelerators. In this paper we present first results on the common interface and its application to beam focusing in ARES, as well as the idea of laser shaping with spatial light modulators at FLUTE. ...
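
A common simulation/machine interface of the kind described can be pictured as a small abstract backend that both the simulation and the machine implement, so the same tuning code runs against either. The class and method names below are illustrative assumptions, not the project's actual API:

```python
from abc import ABC, abstractmethod
import numpy as np

class TransverseTuningBackend(ABC):
    """Common interface so the same optimisation code can target either a
    simulation or the real machine."""

    @abstractmethod
    def set_magnets(self, settings: np.ndarray) -> None: ...

    @abstractmethod
    def get_beam_parameters(self) -> np.ndarray:
        """Return e.g. (mu_x, sigma_x, mu_y, sigma_y) on the screen."""

class SimulationBackend(TransverseTuningBackend):
    def __init__(self):
        self._settings = np.zeros(5)

    def set_magnets(self, settings):
        self._settings = np.asarray(settings, dtype=float)

    def get_beam_parameters(self):
        # toy response: beam size grows with distance from a "good" optic
        return np.full(4, np.linalg.norm(self._settings - 0.3))

def focus(backend: TransverseTuningBackend, candidates: np.ndarray) -> np.ndarray:
    """Backend-agnostic tuning loop: works unchanged on sim or machine."""
    best, best_size = None, np.inf
    for s in candidates:
        backend.set_magnets(s)
        size = backend.get_beam_parameters()[1::2].sum()  # sigma_x + sigma_y
        if size < best_size:
            best, best_size = s, size
    return best
```

A machine backend implementing the same two methods against the real control system would let `focus` (or an RL agent using the same calls) run on the accelerator unchanged, which is the point of the common interface.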

May 24, 2021 · 185 words · RL4AA Collaboration
Reinforcement learning agent combined with the physics-based polynomial neural network.

Physics-Enhanced Reinforcement Learning for Optimal Control

A. Ivanov, I. Agapov, A. Eichler, S. Tomin Deutsches Elektronen-Synchrotron DESY 12th International Particle Accelerator Conference Abstract We propose an approach for incorporating accelerator physics models into reinforcement learning agents. The proposed approach is based on the Taylor mapping technique for the simulation of particle dynamics. The resulting computational graph is represented as a polynomial neural network and embedded into traditional reinforcement learning agents. The application of the model is demonstrated in a nonlinear simulation model of beam transmission. Comparison of the approach with traditional numerical optimization as well as with neural-network-based agents demonstrates better convergence of the proposed technique. ...
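
The core building block, a Taylor map represented as a polynomial neural network, can be sketched as a layer computing x' = W₀ + W₁x + W₂(x ⊗ x), with one layer per beamline element so that gradients flow through the physics model. This is a minimal illustrative sketch in PyTorch, not the authors' implementation:

```python
import torch
import torch.nn as nn

class TaylorMapLayer(nn.Module):
    """One second-order Taylor map step: x' = W0 + W1 x + W2 (x ⊗ x).
    Initialised from a known transfer map, the weights stay physically
    interpretable while remaining trainable (illustrative sketch)."""

    def __init__(self, dim: int):
        super().__init__()
        self.w0 = nn.Parameter(torch.zeros(dim))
        self.w1 = nn.Parameter(torch.eye(dim))           # identity to start
        self.w2 = nn.Parameter(torch.zeros(dim, dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        quad = torch.einsum('oij,bi,bj->bo', self.w2, x, x)
        return self.w0 + x @ self.w1.T + quad

# A beamline as a stack of element maps; differentiable end to end.
beamline = nn.Sequential(TaylorMapLayer(4), TaylorMapLayer(4))
x = torch.randn(8, 4)      # batch of particle coordinates (x, x', y, y')
print(beamline(x).shape)   # torch.Size([8, 4])
```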

May 21, 2021 · 110 words · RL4AA Collaboration
General feedback scheme using the CSR power signal to construct both the state and the reward signal of the Markov decision process (MDP).

Feedback Design for Control of the Micro-Bunching Instability Based on Reinforcement Learning

T. Boltz, M. Brosi, E. Bründermann, B. Haerer, P. Kaiser, C. Pohl, P. Schreiber, M. Yan, T. Asfour, A.-S. Müller Karlsruhe Institute of Technology KIT 10th International Particle Accelerator Conference Abstract The operation of ring-based synchrotron light sources with short electron bunches increases the emission of coherent synchrotron radiation (CSR) in the THz frequency range. However, the micro-bunching instability resulting from self-interaction of the bunch with its own radiation field limits stable operation with constant intensity of CSR emission to a particular threshold current. Above this threshold, the longitudinal charge distribution and thus the emitted radiation vary rapidly and continuously. Therefore, a fast and adaptive feedback system is the appropriate approach to stabilize the dynamics and to overcome the limitations given by the instability. In this contribution, we discuss first efforts towards a longitudinal feedback design that acts on the RF system of the KIT storage ring KARA (Karlsruhe Research Accelerator) and aims for stabilization of the emitted THz radiation. Our approach is based on methods of adaptive control that were developed in the field of reinforcement learning and have seen great success in other fields of research over the past decade. We motivate this particular approach and comment on different aspects of its implementation. ...
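
Following the figure caption above, both the state and the reward of the MDP are built from the measured CSR power signal. A minimal sketch of one such construction is shown below; the window length, normalisation, and fluctuation-based reward are assumptions for illustration, not the authors' design:

```python
import numpy as np

WINDOW = 128  # number of recent CSR power samples forming the state

def make_state(csr_power: np.ndarray) -> np.ndarray:
    """State: the most recent window of the CSR power signal, normalised."""
    w = csr_power[-WINDOW:]
    return (w - w.mean()) / (w.std() + 1e-8)

def reward(csr_power: np.ndarray) -> float:
    """Reward: penalise THz intensity fluctuation over the window, so
    constant CSR emission (the stabilisation goal) scores highest."""
    return -float(np.std(csr_power[-WINDOW:]))
```

The agent's action would then be a modulation applied to the RF system, closing the feedback loop around these two signals.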

May 19, 2019 · 195 words · RL4AA Collaboration
Example of a simulation run.

Orbit Correction Studies Using Neural Networks

E. Meier, Y.-R. E. Tan, G. S. LeBlanc Australian Synchrotron 3rd International Particle Accelerator Conference Abstract This paper reports the use of neural networks for orbit correction at the Australian Synchrotron Storage Ring. The proposed system uses two neural networks in an actor-critic scheme to model a long-term cost function and compute appropriate corrections. The system is entirely based on the history of the beam position and the actuators, i.e. the corrector magnets, in the storage ring. This makes the system auto-tuneable, which has the advantage of avoiding the measurement of a response matrix. The controller will automatically maintain an updated BPM corrector response matrix. In future, if coupled with some form of orbit response analysis, the system will have the potential to track drifts or changes to the lattice functions in “real time”. As a generic and robust orbit correction program, it can be used during commissioning and in slow orbit feedback. In this study, we present positive initial results of the simulations of the storage ring in Matlab. ...
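
The described scheme uses two networks: an actor mapping the recent history of beam positions and corrector settings to corrections, and a critic modelling the long-term cost. A minimal sketch of that structure (dimensions, layer sizes, and the cost definition are illustrative assumptions, and the original work was in Matlab rather than PyTorch):

```python
import torch
import torch.nn as nn

N_BPM, N_COR, HIST = 32, 16, 4   # illustrative ring dimensions

class Actor(nn.Module):
    """Maps the history of BPM readings and corrector settings to corrector
    increments, with no measured response matrix required."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(HIST * (N_BPM + N_COR), 64), nn.Tanh(),
            nn.Linear(64, N_COR), nn.Tanh())     # bounded kick increments

    def forward(self, history):
        return self.net(history)

class Critic(nn.Module):
    """Estimates the long-term cost of a (history, action) pair."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(HIST * (N_BPM + N_COR) + N_COR, 64), nn.Tanh(),
            nn.Linear(64, 1))

    def forward(self, history, action):
        return self.net(torch.cat([history, action], dim=-1))

history = torch.randn(1, HIST * (N_BPM + N_COR))
action = Actor()(history)
cost = Critic()(history, action)  # train critic on orbit RMS, actor to minimise it
```

Because both networks see only the machine's own history, the scheme stays auto-tuneable and can in principle follow slow drifts of the lattice, as the abstract notes.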

May 20, 2012 · 165 words · RL4AA Collaboration