The ever-increasing pace of breakthroughs in deep learning continues to yield astonishing results across the sciences. By now, the use and development of deep learning methods has found its way into probably every area of research and every institute and lab on the DESY campus. For the 6th time we invite you to a meeting dedicated to the exchange of ideas and innovations across the campus.
The idea behind this meeting is to bring together those people interested in and working on machine learning, deep learning and artificial intelligence methods on DESY Campus.
As in previous years, this meeting shall strengthen the network of AI experts who work door to door here on campus, often in entirely different domains, yet using the same tools and algorithms.
This meeting explicitly addresses colleagues from all institutions on campus, including DESY, UHH, EMBL, CSSB, XFEL, MPSD, Hereon and beyond. Please feel invited!
To allow for the most efficient networking, this is an in-person meeting, which will take place in the FLASH Seminar Room on the DESY campus. There will be coffee breaks and a lunch break, giving you the opportunity to meet and get to know the other deep learning experts from the campus. Please register below so that we have a good idea of how many people to expect.
Call for contributions: The main part of the round table will be short talks from all the different fields of DL application, to give the best possible overview. Please consider presenting how you are using deep learning to push your research domain forward. Please submit a short abstract (just one short paragraph) here in Indico by 26 Nov 2023 (extended to 01 Dec 2023).
Organisers: Philipp Heuser (Helmholtz Imaging/DESY), Engin Eren (Helmholtz Imaging/DESY), Gregor Kasieczka (UHH)
The phase retrieval problem is a non-linear, ill-posed inverse problem. It is also an important step in X-ray imaging, a precursor to the tomographic reconstruction stage. Experiments involving micro- and nanometre-sized objects usually exhibit weak absorption and contrast, which is the case in most experiments taking place at large, high-energy synchrotron facilities like DESY. Hence, retrieving the phase information is crucial for the quality of the tomographic reconstruction. The same problem arises in other fields such as astronomy, optics, and electron microscopy. Our research deals with single-distance, near-field or holographic-regime intensity images, which, in the mathematical sense, are the squared modulus of a complex object propagated forward to a certain distance using the Fresnel operator. In this talk, we want to make the case to the audience that generative models can be powerful tools for inverse problems, especially those with clearly defined forward models. We will further show that they play an important role in uncertainty quantification.
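As a rough, self-contained illustration of the forward model described above (a sketch, not the authors' code), a single-distance near-field hologram can be simulated by Fresnel propagation of a complex object wave followed by taking the squared modulus; the grid size and Fresnel number below are placeholder values.

import numpy as np

def fresnel_propagate(obj_wave, fresnel_number):
    """Propagate a complex object wave to the detector plane using the
    Fresnel transfer function (all physical lengths absorbed into the
    dimensionless Fresnel number)."""
    ny, nx = obj_wave.shape
    fx = np.fft.fftfreq(nx)  # spatial frequencies in cycles per pixel
    fy = np.fft.fftfreq(ny)
    FX, FY = np.meshgrid(fx, fy)
    kernel = np.exp(-1j * np.pi * (FX**2 + FY**2) / fresnel_number)
    return np.fft.ifft2(np.fft.fft2(obj_wave) * kernel)

# Toy example: a weak pure-phase object, as for weakly absorbing samples.
phase = 0.1 * np.random.rand(256, 256)        # placeholder phase map
obj_wave = np.exp(1j * phase)                 # unit amplitude, no absorption
hologram = np.abs(fresnel_propagate(obj_wave, fresnel_number=1e-3))**2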
Large language models (LLMs) have demonstrated formidable capabilities in analyzing and synthesizing natural language text. This technology presents an opportunity to extract and connect knowledge from the expansive corpus of textual and visual artifacts accumulated over decades of research at the German Electron Synchrotron (DESY).
This talk will detail current efforts within the DESY MCS4 group to distill insights from various DESY knowledge resources in order to better interpret the collective wisdom generated throughout the center's extensive operational history.
Initial investigations focus on deploying LLMs to process materials from particle accelerator conferences to model semantic relationships within this discipline. We further examine the application of basic LLM architectures for analyzing sequences of log data from control system watchdog nodes, aimed at detecting anomalies. Finally, we share efforts to develop supervised LLMs that ingest specific sources of DESY documentation to automatically generate knowledge summaries, which hold massive potential across various domains at DESY.
At the DESY round table, the CMS Generative Machine Learning Group will showcase three distinct projects, each utilizing point cloud-based generative models to advance particle physics research. The first project, "Attention to Mean Fields for Particle Cloud Generation," features an attention-based generative model that adeptly processes complex collider data represented as point clouds, demonstrating effectiveness on the JetNet150 and CaloChallenge datasets. The second project, "DeepTreeGAN," explores novel techniques for iterative up- and downscaling of point clouds, inspired by the tree-like development of particle showers. Finally, "CaloPointFlow" presents a generative model based on normalizing flows, specifically tailored for efficient and high-fidelity generation of calorimeter showers as point clouds, providing a faster and more efficient alternative to conventional simulations. Together, these projects underscore the transformative role of point cloud-based generative models in particle physics.
The evaluation of plaque assays is a crucial step when studying viruses, as they are used to determine viral reproduction. This is done via a dilution series of the virus, which is applied to gel plates containing a confluent layer of host cells. Infected cells are killed by the virus and the number of empty patches ("plaques") will therefore indicate the viral load of the original sample.
Counting these plaques, however, is not trivial, and evaluating these assays today often remains a manual task. To speed it up, we are developing an approach that combines semantic and instance segmentation to distinguish plaques that have merged, and thus cannot be counted reliably, from plaques arising from a single virus. Merged plaques are identified via semantic segmentation, whereas single plaques are separated by instance segmentation. We are additionally establishing an iterative annotation strategy, in which preliminary predictions corrected by the lab experts are used to re-train new models, quickly increasing the training set size in order to eventually obtain a broadly applicable model.
By combining instance and semantic segmentation, we were able to better reflect the reality of the dataset, and this approach might also be applicable to other real-world datasets. In summary, generating a feedback loop with manually corrected preliminary annotations enables a quick increase in training data.
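As a minimal post-processing sketch of how the two segmentation outputs could be combined (illustrative only, not the authors' pipeline), assume a semantic mask with separate classes for single and merged plaques: single plaques are split into instances with a distance-transform watershed, while merged regions are only flagged. The class labels and min_distance value are placeholders.

import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def count_plaques(semantic_mask):
    """Toy post-processing: semantic_mask is 0 = background, 1 = single plaque,
    2 = merged plaques. Single plaques are separated into instances; merged
    regions are only counted as regions, since plaques inside them cannot be
    counted reliably."""
    single = semantic_mask == 1
    merged = semantic_mask == 2

    # instance separation of single plaques via distance-transform watershed
    distance = ndi.distance_transform_edt(single)
    peaks = peak_local_max(distance, labels=single, min_distance=5)
    markers = np.zeros_like(distance, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    instances = watershed(-distance, markers, mask=single)

    n_single = instances.max()
    n_merged_regions = ndi.label(merged)[1]
    return n_single, n_merged_regions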
In a wide range of high-energy particle physics analyses, machine learning methods have proven to be powerful tools for enhancing analysis sensitivity.
In the past years, various machine learning applications have also been integrated into central CMS workflows, leading to great improvements in reconstruction and object-identification efficiencies.
However, the continuation of successful deployments might be limited in the future by the memory and processing-time constraints of more advanced models evaluated on central infrastructure.
A novel inference approach for models trained with TensorFlow, based on ahead-of-time (AOT) compilation, is presented. This approach offers a substantial reduction in memory footprint while preserving or even improving computational performance.
This talk outlines the strategies and limitations of this novel approach and presents the integration workflow for deploying AOT-compiled models in production.
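As a hedged sketch of what such a workflow can look like (not the actual CMS integration), a TensorFlow model can be exported as a SavedModel with a fixed input signature and then compiled ahead of time with the saved_model_cli tool; the toy model, paths, and C++ class name below are placeholders.

import tensorflow as tf

# Toy model exported as a SavedModel with a fixed input signature,
# which is required for ahead-of-time (AOT) compilation.
inputs = tf.keras.Input(shape=(10,))
hidden = tf.keras.layers.Dense(64, activation="relu")(inputs)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(hidden)
model = tf.keras.Model(inputs, outputs)

@tf.function(input_signature=[tf.TensorSpec([1, 10], tf.float32)])
def serve(x):
    return {"score": model(x)}

tf.saved_model.save(model, "toy_model", signatures={"serving_default": serve})

# AOT compilation to a static library plus C++ header (run on the command line):
#   saved_model_cli aot_compile_cpu \
#       --dir toy_model --tag_set serve --signature_def_key serving_default \
#       --output_prefix toy_model_aot --cpp_class ToyModelAOT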
We introduce a method for efficiently generating jets in the field of High Energy Physics. Our model is designed to generate ten different types of jets, expanding the versatility of jet generation techniques. Beyond the kinematic features of the jet constituents, our model also excels in generating informative features that provide insight into the types of jet constituents, such as features that indicate if a constituent is an electron or a photon, offering a more comprehensive understanding of the generated jets. Furthermore, our model incorporates valuable impact parameter information, enhancing its potential utility in high-energy physics research.
Our research on applying the small-angle X-ray scattering (SAXS) method to X-ray free-electron laser (XFEL) images utilizes normalizing flows for the inversion of experimental X-ray scattering images. One of the main challenges lies in the inversion of such experimental scattering images, which contain various artifacts such as parasitic scattering, slit scattering, the beamstop, and detector background. These artifacts pose a significant domain shift for the neural network used in the inversion process. Parasitic scattering typically appears as a Gaussian-shaped cluster around the primary beam, accompanied by scattered photons in the vicinity. Slit scattering manifests as streaks around the primary beam, while the beamstop obstructs the main beam entirely, resulting in a lack of signal. The detector background refers to an offset with some underlying structure. Currently, the simulated dataset is being modified to incorporate these artifacts. However, this approach may not be sustainable in the future, as the exact characteristics of the artifacts are unknown in advance, and there is limited time to model them during the experiment. Hence, the intention is to collaborate on developing a resilient feature extractor capable of extracting features from both simulated and experimental data, even in the presence of unknown artifacts. These extracted features will subsequently be utilized for the inference process in the primary neural network responsible for the inversion.
We aim to address the problem using contemporary deep learning techniques. At present, we are exploring two potential approaches: one involves learning representations with a $\beta$-VAE, while the other entails utilizing image-to-image translation methods such as CycleGAN and pix2pix.
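For the first of these directions, a minimal sketch of a $\beta$-VAE objective is shown below (assuming a PyTorch encoder/decoder that produces x_hat, mu and log_var); the reconstruction term and the value of beta are illustrative choices, not our final configuration.

import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_hat, mu, log_var, beta=4.0):
    """beta-VAE objective: reconstruction loss plus a beta-weighted KL divergence
    between the approximate posterior N(mu, sigma^2) and the unit Gaussian prior.
    beta > 1 puts more pressure on a disentangled latent representation."""
    recon = F.mse_loss(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + beta * kl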
Large-scale scientific facilities like the European XFEL are complex and include multiple subsystems that work in coordination to generate high-quality scientific output. Any fault within such a subsystem can result in downtime for the entire facility, with a significant impact on the scientific output. It is therefore essential to detect problems or unexpected behaviour in components well in advance, allowing for timely interventions and efficient maintenance planning instead of unplanned activities that stress support personnel. However, monitoring a large number of process variables is often too complex for human observation, and hidden anomalies are easily missed. Machine learning and data-driven techniques can be instrumental in extracting invaluable insights. We present a case study in which faulty patterns in ion pump pressure data are identified. We explore Support Vector Machines (SVMs) and Convolutional Neural Networks (CNNs) to classify data obtained from multiple ion pumps installed at the European XFEL.
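As an illustrative sketch of the classification setup (not the actual European XFEL data or feature set), fixed-length windows of pressure readings can be fed to an SVM classifier with scikit-learn; the data shapes, labels, and hyperparameters below are placeholders.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Toy setup: fixed-length windows of ion-pump pressure readings, labelled
# normal (0) or faulty (1). Real data, features, and labels differ.
X = np.random.rand(500, 128)       # 500 windows of 128 samples each (placeholder)
y = np.random.randint(0, 2, 500)   # placeholder labels

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))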
A brief overview of a selection of HIFIS services that can be useful for Deep Learning applications, like the GPU Compute Service.
HIFIS provides and brokers digital services for everyone in Helmholtz and collaboration partners.
Large industrial facilities are complex systems that not only require regular maintenance and upgrades but are often inaccessible to humans due to various safety hazards. Therefore, a virtual reality (VR) system that can quickly replicate real-world remote environments to provide users with a high level of spatial and situational awareness is crucial for facility maintenance planning and robot teleoperation. However, the exact 3D shapes of these facilities are often too complex to be accurately modeled with geometric meshes through the traditional rasterization pipeline.
In this work, we present how neural graphics primitives such as neural radiance fields (NeRF) and 3D Gaussian splatting (3DGS) can be used to rapidly create one-to-one replications of complex physics facilities at advanced optics laboratories and particle accelerators. We demonstrate how users can interact with such photorealistic virtual environments in immersive mixed and virtual reality, with sample applications in robot teleoperation and facility inspection and planning.
I will present applications of Deep Learning-based protein structure prediction tools, such as AlphaFold, for interpreting experimental data in macromolecular crystallography and electron cryomicroscopy. I will also show examples of my own Deep Learning tools trained to complement the use of predicted models for macromolecular structure determination.
(1) Chojnowski, NAR 2023, 51(15), 8255–8269 (doi: 10.1093/nar/gkad553)
(2) Chojnowski, Acta Cryst. D 2023, 79(7), 559–568 (doi: 10.1107/S2059798323003765)
(3) Chojnowski, Acta Cryst. D 2022, 78(7), 806–816 (doi: 10.1107/S2059798322005009)
(4) Chojnowski, IUCrJ 2022, 9(1), 86–97 (doi: 10.1107/S2052252521011088)
(5) Skalidis et al., Structure 2022, 30(4), 575–589 (doi: 10.1016/j.str.2022.01.001)
Particle colliders such as the LHC produce data at an unprecedented rate and volume. To overcome bandwidth constraints, event filtering systems are employed, with the first stage usually implemented in hardware using FPGAs. We present the first hardware demonstration of a real-time event filtering algorithm using machine learning for the Level-1 Trigger of the CMS experiment.
We present a case study of machine learning (ML) integrated into beamline control to drive autonomous X-ray reflectivity (XRR) measurements [1], which can be seen as a prototypical implementation serving as an example for other in-situ and in-operando synchrotron and neutron experiments. ML strategies for the analysis of reflectometry data have improved significantly in recent years [2]; however, there have been limitations in the robust handling of complex scenarios that might require additional knowledge about the sample for successful XRR fitting. This work addresses these challenges by enabling the use of prior knowledge during the ML fit. During the growth of organic molecular thin films, we established a closed loop between real-time, ML-based online data analysis and the sample environment to tailor the deposition process of organic thin films at the level of a molecular monolayer.
[1] Pithan et al., J. Synchrotron Rad. 30, 1064–1075 (2023)
[2] Hinderhofer et al., J. Appl. Cryst. 56, 3–11 (2023)
Simulating showers of particles in highly-granular detectors is a key frontier in the application of machine learning to particle physics.
Achieving high accuracy and speed with generative machine learning models would enable them to augment traditional simulations and alleviate a significant computing constraint.
This contribution marks a significant breakthrough in this task by directly generating a point cloud of O(1000) space points with energy depositions in the detector in 3D-space. Importantly, it achieves this without relying on the structure of the detector layers. This capability enables the generation of showers with arbitrary incident particle positions and accommodates varying sensor shapes and layouts. Two key innovations make this possible: i) leveraging recent improvements in generative modeling, we apply a diffusion model to ii) an initially even higher-resolution point cloud of up to 40,000 GEANT4 steps. These steps are subsequently down-sampled to the desired number of up to 6000 space points. We demonstrate the performance of this approach by simulating photon showers in the planned electromagnetic calorimeter of the International Large Detector (ILD), achieving overall good modeling of physically relevant distributions.
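To illustrate the down-sampling idea in isolation (a toy sketch, not the procedure actually used in this work), GEANT4 steps falling into the same small voxel can be merged into a single space point carrying the summed energy and the energy-weighted mean position; the voxel size and the random inputs below are placeholders.

import numpy as np

def downsample_steps(points, energies, cell_size=2.0):
    """Merge simulation steps that fall into the same (cell_size)^3 voxel into a
    single space point with the summed energy and energy-weighted mean position."""
    keys = np.floor(points / cell_size).astype(np.int64)
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    n_cells = inverse.max() + 1
    summed_e = np.bincount(inverse, weights=energies, minlength=n_cells)
    centroids = np.stack([
        np.bincount(inverse, weights=points[:, d] * energies, minlength=n_cells) / summed_e
        for d in range(points.shape[1])
    ], axis=1)
    return centroids, summed_e

# e.g. ~40,000 steps reduced to a few thousand space points (random placeholders)
steps = np.random.rand(40_000, 3) * 100.0
e_dep = np.random.exponential(scale=1.0, size=40_000)
space_points, point_energies = downsample_steps(steps, e_dep)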
We present a model-agnostic search for new physics in the dijet final state using five different novel machine-learning techniques. Other than the requirement of a narrow dijet resonance, minimal additional assumptions are placed on the signal hypothesis. Signal regions are obtained utilizing multivariate machine learning methods to select jets with anomalous substructure. A collection of complementary methodologies -- based on unsupervised, weakly-supervised and semi-supervised paradigms -- is used in order to maximize the sensitivity to unknown New Physics signatures.
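One of the weakly-supervised paradigms mentioned above can be illustrated with a CWoLa-style setup: a classifier is trained to separate events in a dijet-mass signal region from sideband events using only substructure features, so that its output acts as an anomaly score without any signal labels. The sketch below uses random placeholder data, a placeholder mass window, and an off-the-shelf classifier, not the analysis configuration.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Toy CWoLa-style weak supervision: label events by region (1 = signal region
# around a dijet-mass window, 0 = sidebands) and train on substructure features.
# If a resonance populates the signal region, the classifier output becomes an
# anomaly score even though no signal labels were ever used.
rng = np.random.default_rng(0)
mjj = rng.uniform(2000.0, 5000.0, 20_000)       # placeholder dijet masses [GeV]
features = rng.normal(size=(20_000, 5))         # placeholder substructure features
in_sr = (mjj > 3300.0) & (mjj < 3700.0)         # placeholder mass window

clf = GradientBoostingClassifier().fit(features, in_sr.astype(int))
anomaly_score = clf.predict_proba(features)[:, 1]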