Deep learning continues to transform science at an unprecedented pace, driving breakthroughs that were once unimaginable. Whether you’re exploring the mysteries of the universe, pushing the boundaries of molecular biology, or working on cutting-edge computational techniques, deep learning is likely at the heart of it. Now, it’s time to share those innovations with your peers.
For the 7th year in a row, we invite you to the Round Table on Deep Learning @DESY—a one-of-a-kind opportunity to connect with fellow researchers across diverse disciplines who are using the same AI tools to solve vastly different problems. This is more than just a meeting; it’s a chance to spark collaborations, ignite new ideas, and expand your network of AI experts right here on campus.
If you're passionate about the potential of machine learning, deep learning, and AI, this is the place to be. Researchers from DESY, UHH, EMBL, CSSB, XFEL, MPSD, Hereon, and beyond are coming together to exchange knowledge, highlight breakthroughs, and discuss the future of AI-powered research. Don't miss out!
What to expect:
- Call for contributions: Got a breakthrough to share? We want to hear how you're using deep learning to propel your research forward. Submit a one-paragraph abstract via Indico by 10 Nov 2024.
- Mark your calendar: 22 Nov 2024 is the day you won't want to miss!
Register below to ensure you're part of this unique exchange of ideas!
Organisers: Philipp Heuser (Helmholtz Imaging/DESY), Engin Eren (Helmholtz Imaging/DESY)
LEAPS – the League of European Accelerator-based Photon Sources – is a strategic consortium initiated by the Directors of the Synchrotron Radiation and Free Electron Laser user facilities in Europe. Its primary goal is to actively and constructively ensure and promote the quality and impact of fundamental, applied and industrial research carried out at each facility to the greater benefit of European science and society. Within LEAPS, a special interest group has been established, focusing on artificial intelligence and machine learning.
This presentation provides an overview of this AI/ML group, how you can get involved, and the benefits of doing so.
Maxwell, GPUs and the future of AI computing in the DESY compute center
Pelagic imaging, the capture of images of plankton and particles in the open-water zones of the oceans, is central to understanding plankton diversity, distribution, and dynamics on a large scale. The integration of deep learning (DL) into pelagic imaging workflows offers the potential to improve the precision and scalability of image-based analyses in plankton research.
This talk will first review our recent work in advancing automated image analysis methods tailored to address the challenges of species classification, trait extraction, and image annotation in plankton studies, focusing on the development of MorphoCluster, a cluster-based, interactive image classification tool. Key results indicate that MorphoCluster surpasses traditional methods in sorting accuracy and annotation speed.
Building on this foundation, the talk will outline GEOMAR’s part in the upcoming HFMI AqQua project, a joint initiative of Helmholtz institutes to create a versatile deep learning model for plankton image recognition, trained on billions of images from diverse instruments.
Foundation models are multi-dataset and multi-task machine learning methods that, once pre-trained, can be fine-tuned for a large variety of downstream applications. The successful development of such general-purpose models for physics data would be a major breakthrough, as they could improve the achievable physics performance while at the same time drastically reducing the required amount of training time and data.
We report significant progress on this challenge on several fronts. First, a comprehensive set of evaluation methods is introduced to judge the quality of an encoding from physics data into a representation suitable for the autoregressive generation of particle jets with transformer architectures (the common backbone of foundation models). These measures motivate the choice of a higher-fidelity tokenization compared to previous works.
Finally, we demonstrate transfer learning between an unsupervised problem (jet generation) and a classic supervised task (jet tagging) with our new OmniJet-$\alpha$ model. This is the first successful transfer between two different and actively studied classes of tasks and constitutes a major step in the building of foundation models for particle physics.
Based on Large Language Model (LLM) and Retrieval-Augmented Generation (RAG) technologies, the FS-EC group has developed an AI agent for technical-documentation and code-retrieval Q&A. This agent will be further enhanced by integrating historical Q&A information from the ticket system, with the aim of co-developing a professional Q&A AI agent for beam scientists and users.
This talk presents applications of Large Language Model (LLM)-powered tools for enhancing daily accelerator operation. First, an overview of LLM tools that utilize Retrieval Augmented Generation (RAG) techniques is provided, demonstrating how existing knowledge bases, such as electronic logbooks, can be leveraged. Additionally, an advanced ReAct prompting approach (Reasoning and Action) is introduced, enabling the development of interactive assistant agents with enhanced problem-solving capabilities.
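As a rough illustration of the RAG pattern described above, the sketch below retrieves the logbook entries most relevant to a query and assembles them into a prompt for an LLM. The `embed` function and the corpus contents are hypothetical placeholders; a real deployment would use an actual embedding model and LLM endpoint.

```python
# Minimal RAG sketch (illustrative only): retrieve relevant logbook entries
# and assemble them into a prompt. `embed` is a hypothetical stand-in for a
# real sentence-embedding model.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: map text to a fixed-size embedding vector."""
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    return rng.normal(size=384)  # stand-in for a real embedding

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Return the k corpus entries most similar to the query (cosine similarity)."""
    q = embed(query)
    scores = [q @ embed(d) / (np.linalg.norm(q) * np.linalg.norm(embed(d)))
              for d in corpus]
    top = np.argsort(scores)[::-1][:k]
    return [corpus[i] for i in top]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Assemble retrieved context and the user question into a single prompt."""
    context = "\n---\n".join(retrieve(query, corpus))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

logbook = ["2024-03-01: RF station 3 tripped after a klystron fault...",
           "2024-03-02: Orbit feedback retuned in the SASE1 undulator..."]
print(build_prompt("Why did RF station 3 trip?", logbook))
```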
Over the last year, a group of imaging enthusiasts has met to discuss current issues in phase contrast imaging and tomography at PETRA III. I will give a short introduction to the experimental modality and introduce some of the challenges we are currently investigating.
In near-field imaging, accurate phase retrieval is crucial for reconstructing complex wavefronts, with applications in optics, microscopy, and X-ray imaging. The beamlines at PETRA III, DESY, like many advanced imaging facilities, involve various inverse problems, including computed tomography, phase retrieval, and image deblurring. Among these, phase retrieval stands out as a non-linear, ill-posed inverse problem that is essential for accurate wavefront reconstruction prior to tomographic analysis. Traditional methods for phase retrieval often lack robustness, particularly under noisy or limited-data conditions. Here, we introduce ForwardNET, a family of generative neural networks specifically designed to address these challenges by learning a forward model that describes the propagation process without needing ground truth data for training. By leveraging this model-driven framework, ForwardNET enables quantitative phase retrieval with high precision. Experimental results demonstrate ForwardNET’s superior accuracy and robustness across diverse imaging scenarios, underscoring its potential to significantly enhance phase retrieval and image quality in X-ray imaging and beyond.
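For orientation, one standard formulation of the near-field forward model that such a network can learn is the Fresnel free-space propagator (up to a constant phase factor; conventions vary, and this is not necessarily the exact parametrization used in ForwardNET):

$$\psi_z = \mathcal{F}^{-1}\!\Big[\mathcal{F}[\psi_0]\; e^{-i\pi\lambda z\,(k_x^2+k_y^2)}\Big], \qquad I_z = |\psi_z|^2,$$

where $\psi_0$ is the complex wavefront at the sample exit plane, $\lambda$ the wavelength, $z$ the propagation distance and $(k_x, k_y)$ the spatial frequencies. Phase retrieval is the ill-posed inversion of the map from the phase of $\psi_0$ to the measured intensity $I_z$.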
In this talk, we explore possible data-driven ML methods for estimating the quality of retrieved phases in the context of near-field holography.
Near-field holography imaging is essential in science and industry for high-resolution imaging of nanostructures and features at microscopic scales, but it is highly sensitive to noise, which varies depending on both the detector type and the exposure time. This study introduces a machine-learning-based denoising method using dilated convolutional neural networks (DnCNN), which effectively reduces noise while preserving spatial details. By training the network with a custom loss function combining MS-SSIM and the L1 norm, this approach captures local and global image context to distinguish signal from noise. Experimental results demonstrate that this denoising method significantly removes noise from low-dose holography images across three detectors—Lambda, Eiger, and Zyla—each with distinct noise characteristics due to their photon-counting and sCMOS technologies. The approach effectively reduces noise while preserving critical spatial details, facilitating improved analysis and interpretation in various scientific and industrial applications.
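A minimal sketch of such a combined MS-SSIM + L1 loss is given below. It assumes the third-party `pytorch-msssim` package, and the weighting `alpha` is an illustrative choice (following common practice in the image-restoration literature), not necessarily the value used in this work.

```python
# Sketch of a combined MS-SSIM + L1 denoising loss. `alpha` weights the
# structural (MS-SSIM) term against the pixel-wise L1 term.
import torch
import torch.nn.functional as F
from pytorch_msssim import ms_ssim  # third-party package: pytorch-msssim

def msssim_l1_loss(pred: torch.Tensor, target: torch.Tensor,
                   alpha: float = 0.84) -> torch.Tensor:
    structural = 1.0 - ms_ssim(pred, target, data_range=1.0)  # MS-SSIM in [0, 1]
    pixelwise = F.l1_loss(pred, target)
    return alpha * structural + (1.0 - alpha) * pixelwise

# Usage inside a training step (net is any denoising CNN, e.g. a dilated DnCNN):
#   loss = msssim_l1_loss(net(noisy), reference)
#   loss.backward()
```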
We present a deep learning approach based on the Noise2Noise framework to denoise multidimensional photoemission spectroscopy (MPES) data obtained with a time-of-flight momentum microscope. Specifically, a 3D U-Net architecture is trained using low- and high-count noisy data, enabling the model to learn noise characteristics without requiring clean images. Our approach excels at reconstructing images even at extremely low count levels (order of $10^{-3}$ counts/pixel), where conventional denoising techniques simply fail. Tests show that a 10-min acquisition processed with our deep learning model resolves major features not even visible after multiple hours of measurement. The presented approach has the potential to streamline the MPES data acquisition process at table-top/laboratory sources as well as large-scale facilities like FEL FLASH. By utilizing our method in future studies, researchers will be able to efficiently optimize acquisition parameters; thus, significant beamtime could be conserved, or an existing beamtime budget could be used more effectively, allowing for the exploration of a broader parameter space.
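For reference, the Noise2Noise idea underlying this approach trains on pairs of noisy observations of the same signal, with no clean target:

$$\hat{\theta} = \arg\min_{\theta}\; \mathbb{E}\,\big\|f_\theta(x_1) - x_2\big\|^2,$$

where $x_1$ and $x_2$ are two noisy realizations (here, low- and high-count acquisitions) of the same underlying data. Because the noise in the target is independent of the input and zero-mean around the true signal, the minimizer coincides in expectation with the one obtained from clean targets.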
As part of our correlative characterisation studies of biodegradable metal bone implants we have performed both synchrotron-radiation microtomography (SR-µCT) and histology sequentially on the same samples and regions of interest. Histological staining is still the gold standard for tissue visualisation yet requires multiple time-consuming sample preparation steps (fixing, embedding, sectioning and staining) before imaging is performed on individual slices, in contrast to the non-invasive and 3D nature of x-ray tomography. In the process of correlating the corresponding data sets, we are able to combine advantages of both modalities by training machine learning networks for modality transfer on SR-µCT/histology pairs to generate artificially stained 3D virtual histology data from SR-µCT datasets, with promising preliminary results.
In materials science research, digital volume correlation (DVC) analysis is commonly used to track deformations and strains to elucidate morphology-function relationships. Recently, we proposed the neural network VolRAFT, which estimates the 3D displacement vector between the reference volume and the deformed volume by extending a state-of-the-art optical flow network from 2D images to 3D volumes. However, this approach is limited by the available GPU memory due to the increased data dimensionality. Hence, in this talk, we will introduce a novel approach that extends VolRAFT with multi-scale volumetric blending to allow full-volume network training and inference.
With the high brilliance and ultrashort pulses of X-ray Free-Electron Lasers, Serial Femtosecond Crystallography (SFX) has achieved atomic resolution for micro and nano protein crystals. Throughout data collection, the beam is prone to fluctuations caused by the self-amplified spontaneous emission process that generates it, which is intrinsically stochastic. These fluctuations affect the photon energy, pulse duration, and intensity of the beam. Although monitoring tools exist to track the beam, uncertainties remain in each SFX measurement due to unknowns such as beam focus and sample position, which are critical but hard to estimate. Using X-ray emission spectroscopy alongside deep neural networks (DNNs), we can estimate beam parameters from emission spectra. By training our model on plasma simulations of a protein crystal, we aim to predict photon energies ranging from 6 to 12 keV, fluences from $5\times10^{2}$ to $5\times10^{5}$ J/cm$^2$, and pulse durations from 3 to 30 fs. By calculating saliency maps for the spectra in our test dataset, we aim to identify the spectral regions most informative to the DNN model, which might allow us to better interpret the model's output and validate its performance for future real-time monitoring on experimental data.
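As a sketch of the saliency computation mentioned above (the model and tensor shapes are placeholders; only the gradient-of-output-with-respect-to-input pattern matters here):

```python
# Minimal saliency-map sketch for a spectrum-to-parameters regressor.
import torch

def saliency(model: torch.nn.Module, spectrum: torch.Tensor,
             output_idx: int) -> torch.Tensor:
    """Return |d prediction[output_idx] / d spectrum| for one input spectrum."""
    x = spectrum.clone().detach().requires_grad_(True)
    y = model(x.unsqueeze(0)).squeeze(0)  # e.g. (photon energy, fluence, pulse duration)
    y[output_idx].backward()
    return x.grad.abs()  # large values = spectral regions the model relies on
```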
X-ray-induced Coulomb explosion imaging is a promising method for single-particle molecular imaging on a femtosecond timescale. When an intense ultrashort XFEL pulse hits a single molecule, the molecule is strongly ionized and violently dissociates into atomic fragments that are measured in coincidence. However, due to the finite detection efficiency of the experiment, the collected data is both unstructured and missing information, which precludes its analysis and interpretation. To get a coherent and informative picture of the data despite these challenges, we developed a new analysis method inspired by machine learning techniques. We applied this methodology to the Coulomb explosion of 2-iodopyridine (C$_5$H$_4$NI), which allows us, in conjunction with simulations, to demonstrate that the experiment measures fingerprints of the collective nature of the quantum ground-state fluctuations of the molecule.
Virtual diagnostics can provide complementary diagnostics by combining information from several sources, thereby profiting from the advantages of each one. To this end, we present the Virtual Spectrometer, which maps data from a low-resolution time-of-flight spectrometer to a high-resolution one. While the low-resolution spectrometer is non-invasive, can operate at 4.5 MHz and has a complex calibration, the high-resolution spectrometer is invasive, operates at 10 Hz and has a simpler calibration procedure. By combining the two through data science methods, a virtual spectrometer with higher resolution than the time-of-flight spectrometer is obtained, while maintaining its other benefits. After a short setup and training period with the invasive grating spectrometer, it is removed from the beamline. The resulting virtual spectra are obtained at 4.5 MHz, non-invasively, with up to 40% higher resolution than the time-of-flight spectrometer. The Virtual Spectrometer can use either a Bayesian linear fit or a Bayesian neural network, depending on the fit-time requirements.
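A minimal sketch of the Bayesian-linear-fit variant, assuming a linear map from low-resolution to high-resolution spectra with a Gaussian prior on the weights (shapes and hyperparameters are illustrative, not the actual ones):

```python
# Closed-form Bayesian linear regression: y = W x with prior precision alpha
# and observation noise variance sigma2.
import numpy as np

def bayesian_linear_fit(X, Y, alpha=1.0, sigma2=1.0):
    """X: (n, d_low) low-res spectra, Y: (n, d_high) high-res spectra."""
    d = X.shape[1]
    S_inv = alpha * np.eye(d) + X.T @ X / sigma2   # posterior precision
    S = np.linalg.inv(S_inv)                       # posterior covariance
    W_mean = (S @ X.T @ Y / sigma2).T              # (d_high, d_low) posterior mean
    return W_mean, S

def predict(W_mean, S, x, sigma2=1.0):
    """Predictive mean and per-output variance for one low-res spectrum x."""
    mean = W_mean @ x
    var = sigma2 + x @ S @ x   # input-dependent uncertainty, shared across outputs
    return mean, var
```

The closed-form posterior is what makes this variant attractive when fit time is tight; the Bayesian neural network trades that speed for flexibility.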
Plasma-based accelerators hold the potential to achieve multi-gigavolt-per-metre accelerating gradients, offering a promising route to more compact and cost-effective accelerators for future light sources and colliders. However, plasma wakefield acceleration (PWFA) is often a nonlinear, high-dimensional process that is sensitive to jitters in multiple input parameters, making the setup, operation and diagnosis of a PWFA stage a challenging task. To tackle some of these issues, machine learning techniques have gained popularity in the field of plasma acceleration. This talk provides a brief overview of how machine learning methods are being applied at FLASHForward, a beam-driven plasma wakefield accelerator test bed based at DESY, Hamburg, with emphasis on those related to deep learning.
In machine learning, the ability to make reliable predictions is paramount. Yet, standard ML models and pipelines provide only point predictions without accounting for model confidence (or the lack thereof). Uncertainty in model outputs, especially when faced with out-of-distribution (OOD) data, is essential when deploying models in production. This talk serves as an introduction to the concepts and techniques for quantifying uncertainty in machine learning models. We will explore the different sources of uncertainty and cover various methods for estimating these uncertainties effectively. By understanding and addressing uncertainty, particularly in the context of OOD data, practitioners can enhance the robustness of their models and foster greater confidence in model predictions.
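As one concrete example of such a technique, the sketch below uses Monte Carlo dropout, one of the cheaper methods typically covered in such an introduction; the network is a placeholder, and only the "keep dropout active at inference and sample" pattern matters.

```python
# MC-dropout sketch: attach uncertainty estimates to an existing network
# by sampling stochastic forward passes.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(),
                    nn.Dropout(p=0.2), nn.Linear(64, 1))

def mc_predict(model: nn.Module, x: torch.Tensor, n_samples: int = 100):
    """Predictive mean and std from stochastic forward passes with dropout on."""
    model.train()  # keep dropout active (no gradients needed)
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)

mean, std = mc_predict(net, torch.randn(8, 16))
# Large std flags inputs the model is unsure about, e.g. out-of-distribution data.
```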
The European XFEL is a complex machine comprising hundreds of subsystems, many of which require frequent calibration. Automating these calibrations frees operators' time and potentially increases the exploitation of allotted beamtime.
This presentation shows three use cases. The first uses Bayesian optimization to spatially align an optical laser to a camera. The second uses mutual information together with Bayesian optimization to optimize a coordinate transformation for data analysis. The third uses computer vision to detect conditions that could damage imagers and to take action if necessary.
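A minimal sketch of how the first use case could look with an off-the-shelf Bayesian optimization routine (here scikit-optimize's `gp_minimize`); the objective function and parameter ranges are hypothetical stand-ins for the real alignment procedure:

```python
# Bayesian-optimization sketch for laser-to-camera alignment.
import numpy as np
from skopt import gp_minimize

def measure_misalignment(mirror_angles):
    """Placeholder objective: move mirrors, grab a camera frame, score the spot
    (here a synthetic quadratic bowl plus measurement noise)."""
    ax, ay = mirror_angles
    return (ax - 0.3) ** 2 + (ay + 0.1) ** 2 + 0.01 * np.random.randn()

result = gp_minimize(
    measure_misalignment,
    dimensions=[(-1.0, 1.0), (-1.0, 1.0)],  # allowed mirror-angle ranges
    n_calls=30,                             # each call is one real measurement
    random_state=0,
)
print("best angles:", result.x, "residual misalignment:", result.fun)
```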
The talk focuses on an ongoing effort to predict x-ray pulse properties from machine settings and available diagnostics via a surrogate model. While still at an early stage, preliminary results already provide useful insights into the correlation between electron bunches and x-ray spectral properties at MHz repetition rates. The goal of the program is not only to provide a surrogate model of the machine, but also to allow for its inversion, i.e. a systematic method to obtain machine-setting ranges that produce the desired photon beam properties.
Virtual diagnostic tools leveraging readily available input data offer a non-invasive way to optimize Free-Electron Laser (FEL) operation and delivery, especially when conventional diagnostics run into limitations. This work presents a novel approach using an artificial neural network to predict photon pulse pointing online at the MHz level for both soft and hard x-rays. The model input is based purely on parasitically available diagnostics of both the electron and the photon beam. The model is validated against diamond sensor measurements at 11 keV, achieving a correlation coefficient greater than 0.95. This virtual diagnostic not only streamlines beam alignment and optimization, but also lays the foundation for MHz-capable beam pointing stabilization. It further improves the online characterization of each photon pulse at the MHz level.
At the LHC, collision events are produced every 25 ns. To handle these large data streams, the CMS trigger system filters events in real time. The first stage of that system, the Level-1 trigger, is implemented in hardware using FPGAs. We present a novel ML-based anomaly detection algorithm that has been integrated into the Level-1 trigger and successfully took data during the 2024 pp collisions at CMS.
The likelihood ratio (LR) plays an important role in statistics and many domains of science. The Neyman-Pearson lemma states that it is the most powerful test statistic for simple statistical hypothesis testing problems [1] or binary classification problems. Likelihood ratios are also key to Monte Carlo importance sampling techniques [2]. Unfortunately, in many areas of study the probability densities comprising the likelihood ratio are defined by implicit models, and so are intractable to compute explicitly [3].
Neural-network-based LR estimation using probabilistic classification has therefore had a significant impact in these domains, providing a scalable method for determining an intractable LR from simulated datasets via the so-called ratio trick [4, 5]. These approaches typically adhere to the standard Kolmogorov axioms of probability theory [6]. In particular, they assume the first axiom: the probability of an event is a non-negative real number. However, there are settings in which synthetically generated data (e.g. Monte Carlo sampling) $\{(x_{i}, w_{i})\}^{N}_{i=1}$ contains weights that are negative, $w_{i} < 0$ [7, 8]. These negative weights are a symptom of a class of distribution known as quasiprobabilities, which do not adhere to the first Kolmogorov axiom. Consequently, the probability-like distribution has a negative density [9]: $q(x) < 0$ for some $x$.
In high energy physics, negative weights/densities are a commonly observed feature of Monte Carlo simulated proton-proton (pp) collision datasets [10-13]. Whether it be due to quantum interference between Standard Model and new physics processes, or algorithms that match/merge matrix element calculations of beyond leading order Quantum Chromodynamic processes with parton showers, Monte Carlo simulation codes often introduce negatively weighted data.
This work will present a general approach to extending the neural based LR trick to quasiprobabilistic distributions. It will demonstrate that a new loss function, combined with signed probability measures (Hahn-Jordan decomposition), can be used to decompose the likelihoods into signed mixture models. A quasiprobabilistic analog of the Likelihood Ratio is then constructed using a ratio of signed mixture models. The technique is demonstrated using di-Higgs production via gluon-gluon fusion in $pp$ collisions at the Large Hadron Collider [14].
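For orientation, the standard ratio trick [4, 5] and the Hahn-Jordan decomposition referenced above can be summarized as

$$\frac{p(x)}{q(x)} = \frac{s(x)}{1 - s(x)}, \qquad q(x) = q_+(x) - q_-(x), \quad q_\pm(x) \ge 0,$$

where $s(x)$ is a classifier trained to distinguish samples from $p$ and $q$, which converges to $p(x)/(p(x)+q(x))$; the quasiprobabilistic extension replaces the signed densities by such decompositions into non-negative components before forming the ratio.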
References:
[1] Jerzy Neyman, Egon Sharpe Pearson, and Karl Pearson. IX. On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 231(694-706):289–337, 1933.
[2] Christian Lemieux. Monte Carlo and Quasi-Monte Carlo Sampling. Springer, New York, NY, USA, 2009.
[3] Peter J. Diggle and Richard J. Gratton. Monte Carlo methods of inference for implicit statistical models. Journal of the Royal Statistical Society: Series B (Methodological), 46(2):193–212, 1984.
[4] Masashi Sugiyama, Taiji Suzuki, and Takafumi Kanamori. Density Ratio Estimation in Machine Learning. Cambridge University Press, 2012.
[5] Kyle Cranmer, Juan Pavez, and Gilles Louppe. Approximating likelihood ratios with calibrated discriminative classifiers, 2016.
[6] A.N. Kolmogorov. Grundbegriffe der Wahrscheinlichkeitsrechnung. Number 1. Springer Berlin, Heidelberg, 1933.
[7] Stefano Frixione and Bryan R. Webber. Matching NLO QCD computations and parton shower simulations. Journal of High Energy Physics, 2002(06):029, June 2002.
[8] Paolo Nason and Giovanni Ridolfi. A positive-weight next-to-leading-order Monte Carlo for Z pair hadroproduction. Journal of High Energy Physics, 2006(08):077, August 2006.
[9] Richard Phillips Feynman. Negative probability. 1984.
[10] Lyndon Evans and Philip Bryant. LHC Machine. Journal of Instrumentation, 3(08):S08001, August 2008.
[11] ATLAS Collaboration. Modelling and computational improvements to the simulation of single vector-boson plus jet processes for the ATLAS experiment. Journal of High Energy Physics, 2022(8), August 2022.
[12] Andrea Valassi, Efe Yazgan, et al. Challenges in Monte Carlo event generator software for high-luminosity LHC. Computing and Software for Big Science, 5(1), May 2021.
[13] ATLAS Collaboration. Study of $t\bar{t}b\bar{b}$ and $t\bar{t}W$ background modelling for $t\bar{t}H$ analyses. 2022.
[14] Amos Breskin and Rudiger Voss. The CERN Large Hadron Collider: Accelerator and Experiments. CERN, Geneva, 2009.
A normalising flow is a stochastic tool that can be used for generative modelling and reconstruction. While not the lightest models in the toolbox, normalising flows are often very accurate and their bi-directionality can be uniquely advantageous. Literature that guides architecture and design choice for users of these models is focused on non-HEP applications, and optimal results in HEP require rethinking those guidelines. We leverage toy models, and our experience with real HEP use, to provide guidance targeted at HEP users.
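For reference, the change-of-variables identity that makes flows exact density estimators, and whose invertibility gives the bi-directionality mentioned above:

$$p_X(x) = p_Z\big(f_\theta(x)\big)\,\left|\det \frac{\partial f_\theta(x)}{\partial x}\right|,$$

where $f_\theta$ is an invertible network mapping data to a simple base distribution $p_Z$. Evaluating the formula gives exact likelihoods, while sampling runs $f_\theta^{-1}$ on base samples; architecture choices trade off the cost of these two directions.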
Searches for physics beyond the Standard Model at the Large Hadron Collider usually rely on phenomena that affect leptons, photons or jets with high transverse momenta (> 15 GeV).
Alongside these hard physics objects, proton-proton collisions produce a multitude of soft ones, known as the underlying event. This work focuses on the search for anomalies among the soft physics objects, a phase space not studied before, which would hint at the existence of particular new phenomena. A feasibility study is currently being performed on Monte Carlo simulations using CATHODE, a model-agnostic search strategy that uses outer density estimation to detect anomalies. First results and encountered challenges will be presented.
Ever-increasing collision rates place significant computational stress on the simulation of future experiments in high energy physics. Generative machine learning (ML) models have been found to speed up and augment the most computationally intensive part of the traditional simulation chain: the calorimeter simulation. Many previous studies relied on a fixed, grid-like data representation of electromagnetic showers, which leads to artifacts when applied to highly granular calorimeters due to the aperiodic tiling of cells in realistic detector geometry. With this contribution, we present CaloClouds III, an updated version of the novel point cloud diffusion model CaloClouds II. This new version features a simplified architecture that further accelerates inference time, along with added angular conditioning, allowing integration into the simulation pipeline. The model was tested in a realistic DD4hep-based simulation model of the ILD detector concept for a future Higgs factory. This is done with the DDFastShowerML library, which has been developed to allow easy integration of generative fast simulation models into any DD4hep-based detector model. With this, it is possible to benchmark the performance of a generative ML model using fully reconstructed physics events by comparing them against the same events simulated with Geant4, thereby ultimately judging the fitness of the model for application in an experiment's Monte Carlo.
Monte Carlo (MC) simulations are essential for collider experiments, enabling comparisons of experimental findings and theoretical predictions. However, these simulations are computationally demanding, and future developments, like increased event rates, are expected to exceed available computational resources. Generative modeling can substantially cut computing costs by augmenting MC simulations, thereby addressing this issue. To this end, last year we presented ConvL2LFlows, a convolutional-flow-based generative model. This year, we present several improvements to this model, making it usable in realistic simulations: (i) adding angular conditioning to generate showers with arbitrary incident angles; (ii) using nine times more bins than calorimeter readout cells, so that the model can be used for arbitrary incident points; and (iii) integrating L2LFlows into the full simulation pipeline using DDFastShowerML.
TBD