ErUM-Data Community Information Exchange

Europe/Berlin
Online

Online

Andreas Haungs (Karlsruhe Institute of Technology - KIT), Bridget Murphy (Kiel University), Christian Gutt (Universität Siegen), Erik Bruendermann (KIT), Kilian Schwarz (GSI), Markus Schumacher (Albert-Ludwigs-Universität Freiburg), Martin Erdmann (RWTH Aachen University)
Description

From Big Data to Smart Data: Digitalisation in basic scientific research

Welcome

You can browse your fellow researchers' interests and skills via the Timetable to match up for joint research. If you have a question, please visit the FAQ. If you do not find a satisfactory answer, please contact the chairs of this Indico site.

We look forward to your contributions to the Information Exchange and wish you every success in building research teams, as well as interesting reading of the other contributions. The Information Exchange lives on your abstracts, so please do not hesitate to submit.

For posting your own information please follow the four steps:

  1. read the "Data Privacy Policy",
  2. create an "Indico Account",
  3. "Register" to comply with the General Data Protection Regulation, and
  4. post your interests via the "Call for Abstracts".

For your information:

  • We use Indico for our exchange as it provides many useful features.
  • You can withdraw your registration and your abstract at any time.
  • Please consider this Indico site a long-term community event and disregard the specific dates and times in the timetable, as we do not use Indico here for a specific conference.
Registration
Your registration to the ErUM-Data Community Information Exchange
Participants
  • Alexander Schoekel
  • Andrea Thorn
  • Andreas Döpp
  • Andreas Haungs
  • André Hilger
  • Anton Barty
  • Arno Straessner
  • Astrid Hoelzing
  • Barbara Jäger
  • Bastian Pfau
  • Bridget Murphy
  • Britta Höpfner
  • Catalina Jimenez
  • Christopher Wiebusch
  • David Meier
  • Dirk Heinen
  • Dirk Lützenkirchen-Hecht
  • Dmitry Malyshev
  • Erik Bründermann
  • Gordian Edenhofer
  • Gregor Kasieczka
  • Günter Quast
  • Günther Dollinger
  • Hans-Georg Steinrück
  • Henrike Müller-Werkmeister
  • Jajnabalkya Guhathakurta
  • Jakob Roth
  • Jan-Dierk Grunwaldt
  • Jens-Uwe Hoffmann
  • Joachim Wuttke
  • Johan Messchendorp
  • Judith Reindl
  • Kai Zhou
  • Karan Molaverdikhani
  • Kilian Schwarz
  • Lars Lühl
  • Leila Noohinejad
  • Luis Vera Ramirez
  • Marcel Kunze
  • Margarita Russina
  • Marina Ganeva
  • Markus Elsing
  • Markus Köhli
  • Markus Osterhoff
  • Martin Erdmann
  • Martin Landesberger
  • Michael Kramer
  • Michael Lupberger
  • Michele Caselle
  • Miriam Fritsch
  • Neelima Paul
  • Olaf Magnussen
  • Patrick Reichart
  • Rahul Singh
  • Ralf-Jürgen Dettmar
  • Roger Wolf
  • Sascha Eichstädt
  • Sebastian Busch
  • Sebastian Degener
  • Shahab Sanjari
  • Sonal Ramesh Patel
  • Stefan Wagner
  • Stefano Pasini
  • Tamara Husch
  • Thomas Bretz
  • Thomas Kuhr
  • Thomas Schörner-Sadenius
  • Tim Ruhe
  • Tobias Richter
  • Tobias Stockmanns
  • Torben Ferber
  • Ulrich Husemann
  • Uwe Klemradt
  • Wolfram Helml
  • Wednesday, 1 January
    • 09:00 09:05
      Particle and Astroparticle Physics: data-driven knowledge acquisition 5m
      • Most burning research question: Application of Information Field Theory to current issues in signal processing.
      • would like to improve my knowledge: Information Field Theory.
      • expertise: Deep Learning.
      • data handling: Parallel processing of event-by-event operations for fast turnarounds, scientific user interface for modern work in physics via web platform.
      • kind of data: event-by-event data -> Algorithm development, especially deep learning methods and their application.
      • expertise in computing and / or software development: Long-term experience in large software projects with 3-12 developers, tracking systems, feedback etc.; CRPropa simulation program; VISPA cloud service for data analysis.
      • field and role: University professor in experimental physics, particle and astroparticle physics.
      Speaker: Martin Erdmann (RWTH Aachen University)
    • 09:05 09:10
      Large Scale and FAIR Data Management / Neuroevolution 5m

      most burning research questions:
      - building a Data Lake for large-scale and FAIR data management for the FAIR facility
      - automating the search for an optimal neural network architecture using evolutionary algorithms, e.g. via neuroevolution (see the sketch below)
      area to improve knowledge skills: FAIR data management
      area to contribute: Big Data Management / Distributed Optimisation
      data handling teaching: XRootD-based data management solutions
      data to deal with: experimental and processed data of ALICE and FAIR
      expertise in computing: Big Data Management (Data Lakes, Dynamic Data Caches, high-speed data transfer) & Distributed Optimisation
      field and role: GSI IT / head of distributed computing, KHuK representative for Data and Information
      ErUM-Committee: KHuK
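
      As an illustration of the neuroevolution point above, here is a minimal Python sketch of an evolutionary architecture search; the fitness function is a toy placeholder standing in for training and validating a real network:

import random

SEARCH_SPACE = {
    "layers": [1, 2, 3, 4],
    "units": [16, 32, 64, 128],
    "activation": ["relu", "tanh"],
}

def random_genome():
    # One candidate architecture, encoded as a dict of hyperparameters.
    return {key: random.choice(values) for key, values in SEARCH_SPACE.items()}

def mutate(genome):
    # Change one randomly chosen gene of a parent architecture.
    child = dict(genome)
    key = random.choice(list(SEARCH_SPACE))
    child[key] = random.choice(SEARCH_SPACE[key])
    return child

def fitness(genome):
    # Placeholder: in practice, build, train and validate the network and
    # return e.g. the validation accuracy.
    return -abs(genome["layers"] - 3) - abs(genome["units"] - 64) / 64.0

population = [random_genome() for _ in range(10)]
for generation in range(20):
    population.sort(key=fitness, reverse=True)
    parents = population[:4]                                   # keep the fittest candidates
    offspring = [mutate(random.choice(parents)) for _ in range(6)]
    population = parents + offspring

print("best architecture found:", max(population, key=fitness))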

      Speaker: Kilian Schwarz (GSI)
    • 09:10 09:15
      Insights into physics by AI & ML analysis and visualisation of complex data 5m
      • most burning question: how to easily access and run a plethora of AI & ML algorithms on the same data and benchmark it to known standards in statistics and models in physics.
      • like to gain new knowledge/insights via visualisation of condensed, complex, partly sparse, partly correlated, multidimensional data.
      • expertise: data analysis and visualization.
      • data handling: good practice at a large-scale facility and, together with IT experts, tackling challenges in research data management
      • kind of data: multi-dimensional incl. timeseries and multi-modal data. Big data.
      • expertise computing / software development: long-term experience in programming and data analysis
      • field: physics (incl. mathematics)
      • role: head of department (accelerator R&D and operations). Dept. with IT group servicing large-scale facility and AI&ML team active in various use cases such as autonomous accelerator
      Speaker: Erik Bründermann (KIT)
    • 09:15 09:20
      ML methods for the solution of biological macromolecules 5m

      My group and I are interested in using ML methods, mainly convolutional neural networks, for structure solution in macromolecular crystallography and single-particle Cryo-EM. This involves everything from data collection, finding measurement and processing problems, processing, phasing or alignment, to modelling/interpretation and validation.
      So far, we have released a neural network to annotate secondary structure and RNA/DNA in Cryo-EM maps called HARUSPEX and are working to find problematic reflection data in processed and merged diffraction data sets with ML.

      Speaker: Andrea Thorn (DESY (Deutsches Elektronen Synchrotron))
    • 09:20 09:25
      Artificial intelligence & machine learning for data monitoring and data analysis for reflectivity data, coherent X-ray diffraction and for future XPCS applications at liquid interfaces 5m

      I work at synchrotrons and FELs. In this programme I am interested in developing hardware and software solutions for detecting beam damage, in automated data collection and reduction, and in particular in online smart data analysis and online and post-processing for reflectivity, pump-probe investigations, coherent X-ray diffraction, and nano-scanning X-ray diffraction.
      My main scientific fields:
      • Non-equilibrium dynamics of liquids. Pump/probe investigation of liquid interfaces including solvation processes and charge and ion transfer using synchrotron and Free Electron Laser sources. Strongly correlated systems (liquid metals) and bio-molecular films (lipid membranes).
      • Physics at surfaces and interfaces: investigating the dynamics and structure of solids and liquids with X-ray diffraction, X-ray reflectivity, grazing-incidence X-ray diffraction (GID) and diffuse X-ray investigations on liquid/liquid interfaces. In situ studies of electrochemically controlled liquids.
      • Investigating magnetically induced mechanical strain in microstructures with nano-focus X-ray scattering and strain coupling at magnetostrictive buried interfaces via XRD, GID, Scanning nanoXRD, Coherent X-ray scattering.

      Speaker: Dr Bridget Murphy (Kiel University)
    • 09:25 09:30
      Targeting and tracking biological cells in microscopy time-lapse videos by deep learning algorithms 5m

      Targeting a large number of biological cells or even substructures of cells at an ion microbeam requires a robust detection algorithm that is able to differentiate and track cells in their various states from low-contrast micrographs. A similar requirement exists for automated, individual cell identification, characterisation and tracking in order to follow up cellular reactions after irradiation by ionizing radiation or after any other cell-manipulating treatment. It will allow new qualities of research in radiobiology and other fields where the reaction of cells to hazards or to any other kind of manipulation is studied at the cell level.
      We propose to develop and use deep learning algorithms for cell classification, identification and cell tracking in time-lapse micrographs. First results on the way to automated cell classification and identification by Faster R-CNN algorithms have already been obtained (S. Rudigkeit et al., submitted).
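
      A minimal sketch of the Faster R-CNN ingredient, assuming a recent torchvision and using dummy tensors in place of annotated micrographs (linking detections across frames for tracking is not shown):

import torch
import torchvision

# Two object classes (e.g. "cell", "substructure") plus background; no pretrained
# weights are downloaded here.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=None, weights_backbone=None, num_classes=3
)

images = [torch.rand(3, 512, 512)]                            # one RGB micrograph
targets = [{
    "boxes": torch.tensor([[30.0, 40.0, 90.0, 110.0]]),       # x1, y1, x2, y2 of one cell
    "labels": torch.tensor([1]),                              # class index of that box
}]

model.train()
loss_dict = model(images, targets)            # torchvision returns a dict of detection losses
loss = sum(loss_dict.values())
loss.backward()                               # an optimiser step would follow in real training

model.eval()
with torch.no_grad():
    detections = model(images)                # per frame: boxes, labels, scores
print(detections[0]["scores"][:5])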

      Speakers: Günther Dollinger (Universität der Bundeswehr München), Prof. Judith Reindl (Universität der Bundeswehr München)
    • 09:30 09:35
      Artificial intelligence and machine learning tools for time-resolved and tomographic data in materials science and catalysis 5m

      In functional materials like catalysts we deal with a very large amount of correlated data analysed by spectroscopy/microscopy/scattering in combination with function (e.g. infrared/gas chromatography/mass spectroscopy data). Hence, this results not only in 3D but also in 4D and 5D data sets. The challenge is the large amount of data, for which we seek collaborations. In addition, machine learning tools could be particularly interesting when several species are present.
      In fact, we are interested in correlating tomographic X-ray diffraction with further techniques including operando studies. This would result in new insights in our field and improved catalyst design - catalysts are used for > 90% of all chemical products applied around the globe.

      Speaker: Jan-Dierk Grunwaldt (Karlsruhe Institute of Technology)
    • 09:35 09:40
      Development and optimization of algorithms with modern technologies to maximize the physics output extracted from data 5m

      The goal of my research is to study fundamental particles and interactions with high precision to gain knowledge about the underlying principles. The data for this research is recorded by the Belle II experiment at the electron-positron collider SuperKEKB.

      Algorithms and modern technologies, such as machine learning, play a crucial role in optimally using the available data. The ability to implement complex models allows using them for the generation of increasingly realistic simulations. An interesting research question is whether generalized solutions can be found or to what extent problem-specific solutions have to be developed.

      Speaker: Thomas Kuhr (BELLE (BELLE II Experiment))
    • 09:40 09:45
      Data reduction and data analysis for neutron and x-ray scattering 5m

      The Scientific Computing Group at Heinz Maier-Leibnitz Zentrum Garching develops and maintains data treatment software for neutron scattering.

      Our main projects are currently:

      • BornAgain, software to simulate and fit reflectometry, off-specular scattering and grazing-incidence small-angle scattering;
      • Data reduction software for single-crystal diffraction, not yet operational;
      • Steca, the strain and texture calculator: data reduction for materials diffraction;
      • data analysis for high-resolution spectroscopy, early planning stage.

      We are interested in collaborations to improve any of the above.

      And we are available to support other software projects that may be used to treat neutron data from our facility.

      Speaker: Joachim Wuttke (JCNS)
    • 09:45 09:50
      Use of machine learning to classify/interpret diffuse scattering 5m

      Diffraction patterns of disordered crystals are characterized by a rich presence of diffuse scattering between the Bragg reflections. No unified theory for their analysis and interpretation exists that would be comparable to direct structure determination techniques as applied to Bragg reflection intensities.
      The intention is to build a toolbox using machine learning and deep learning techniques (see the sketch below) to:
      - classify the distribution of diffuse scattering, such as but not exclusively: single maxima, linear rods, layers, curved features, etc.
      - assess the building principles using aspects of molecular form factors etc.
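
      A minimal sketch of such a classifier, assuming a small PyTorch CNN and random tensors in place of measured or simulated diffraction patterns:

import torch
import torch.nn as nn

CLASSES = ["single maxima", "linear rods", "layers", "curved features"]

# Small CNN mapping a 128x128 diffraction pattern to one of the classes above.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 32 * 32, len(CLASSES)),
)

patterns = torch.rand(8, 1, 128, 128)           # batch of 2D patterns (random stand-ins)
labels = torch.randint(0, len(CLASSES), (8,))   # labels from simulated/annotated examples

optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = nn.functional.cross_entropy(model(patterns), labels)
optimiser.zero_grad()
loss.backward()
optimiser.step()

predicted = model(patterns).argmax(dim=1)       # class index per pattern
print([CLASSES[i] for i in predicted.tolist()])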

      Speaker: Reinhard Neder (Friedrich-Alexander-Universität Erlangen-Nürnberg)
    • 09:50 09:55
      ML on FPGAs for future DAQs 5m
      • Future LHC and other experiments will produce more data than can be handled with current technology (spatial resolution, additional precise timing information, increased sensor size => larger output bandwidth of new-generation frontend chips)
      • Same issue in many sectors of the information-driven society: data explosion
      • One solution: shift methods currently used in online and offline data processing to an earlier stage in the DAQ chain => smart data: transmitting data properties such as cluster or track parameters
      • Currently: dedicated feature extraction algorithms (e.g. Hough transformation) implemented on FPGAs; ML methods recently applied in triggering
      • Problem: the application of more advanced ML methods to reconstruct feature properties, as they are used in online computer farms on CPUs and GPUs, has so far been hindered by a lack of FPGA resources
      • NEW: FPGA vendors are currently including dedicated AI cores in addition to FPGA and CPU resources (System on Chip)

      => This project: evaluate CPU+FPGA+AI devices (cross-disciplinary interest)

      • Transfer existing ML methods from computer science to hardware (high-level synthesis tools & possibly a more efficient implementation in a hardware description language)
      • Use data from the latest frontend ASICs in high-rate experiment R&D for qualification
      • Additional topics:
        -- reduction of power consumption for computing
        -- sociological: data protection by requested feature extraction within the acquisition
        -- philosophical: trustworthy AI
      Speaker: Michael Lupberger (University of Bonn)
    • 09:55 10:00
      Advanced On- and Off-line reconstruction methods in Neutrino Astronomy 5m

      The IceCube Neutrino Observatory measures cosmic neutrinos by detecting Cherenkov light from neutrino interactions using optical sensors embedded in the Antarctic ice. The main challenge is related to the sparseness of locally complex information and the systematic uncertainties related to the propagation of optical photons through the natural ice medium and its properties. Machine-learning techniques have already been applied very successfully to IceCube, and my interest is focused on two unsolved questions that seem ideally suited for ML applications.
      1.) Inclusion of systematic uncertainties for real-time alerts that are sent to the astrophysical community. The proper estimation of the event-direction uncertainty is critical. Currently, alerts do not include systematic uncertainties. Those vary with the individual event topology and can only be estimated by a computationally intense analysis. The goal is a better estimation of uncertainty predictions by DNNs (see the sketch below).
      2.) Reconstruction for next-generation multiple-PMT sensors. Next-generation instruments will consist of optical sensors with multiple photomultipliers integrated into a single sensor unit, providing a fly's-eye type of local light measurement. Traditional likelihood approaches have so far failed to benefit from the full information. The goal of this project is to develop a new reconstruction that A) extracts relevant features from the single-sensor images and B) improves track reconstruction by combining these features.
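
      A minimal sketch of a network predicting an event direction together with a per-event uncertainty, trained with a Gaussian negative log-likelihood (random tensors replace real IceCube events; the input features are hypothetical summary variables):

import torch
import torch.nn as nn

n_features = 64                                 # hypothetical summary features per event
net = nn.Sequential(nn.Linear(n_features, 128), nn.ReLU(), nn.Linear(128, 4))
# network outputs: 2 direction components (e.g. zenith, azimuth) + 2 log-variances

events = torch.rand(256, n_features)            # random stand-ins for reconstructed events
true_dir = torch.rand(256, 2)                   # true directions from simulation

out = net(events)
mean, log_var = out[:, :2], out[:, 2:]
loss = nn.GaussianNLLLoss()(mean, true_dir, log_var.exp())   # per-event variance from the net
loss.backward()

print("predicted sigma of first event:", log_var[0].exp().sqrt().detach())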

      Speaker: Christopher Wiebusch (RWTH Aachen)
    • 10:00 10:05
      NN decisions that are robust against systematic uncertainties in the NN feature space 5m

      We are interested in neural network (NN) applications with usually not more than a few hundred input features and large (usually) synthetic training samples. The features are subject to systematic uncertainties, which have to be accounted for in the NN decision taking. In preliminary publications we have investigated the following areas of interest:

      • Which input features to an NN are most influential for its decision
        taking [1]?
      • Methods to make the NN aware of systematic uncertainties
        during training, via the loss function [2] (a generic sketch follows below).
      • Methods to move the NN training objective closer to the full likelihood,
        including all systematic uncertainties, as the actual objective of the measurement [3].

      Our studies are based on a PhD thesis [4] and three Master's theses [5]. We plan to establish the full subject of our research proposal with an application to future measurements of differential cross sections for Higgs boson production at the LHC in mind, which can serve as a demonstration of the usefulness and importance of this research for any field where high credibility and tight monitoring of NN decision taking are desirable.

      [1] https://arxiv.org/abs/1803.08782
      [2] https://arxiv.org/abs/1907.11674
      [3] https://arxiv.org/abs/2003.07186
      [4] https://cds.cern.ch/record/2751100?ln=de
      [5] https://publish.etp.kit.edu/record/21436, https://publish.etp.kit.edu/record/21950, https://publish.etp.kit.edu/record/22042
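
      As a minimal illustration of the loss-function idea in the second bullet above, a sketch that penalises the sensitivity of a classifier output to a systematic variation during training (a generic decorrelation penalty, not the specific methods of refs. [1]-[3]):

import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())

x_nominal = torch.rand(512, 10)                                # nominal synthetic sample
x_shifted = x_nominal + 0.05 * torch.randn_like(x_nominal)     # same events with a systematic shift
labels = torch.randint(0, 2, (512, 1)).float()

y_nom, y_shift = net(x_nominal), net(x_shifted)
classification = nn.functional.binary_cross_entropy(y_nom, labels)
penalty = ((y_nom - y_shift) ** 2).mean()       # penalise sensitivity to the variation
loss = classification + 10.0 * penalty          # trade-off set by the prefactor
loss.backward()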

      Speaker: Roger Wolf (KIT - Karlsruhe Institute of Technology (DE))
    • 10:05 10:10
      Identifying and characterizing spectral lines 5m

      Our specific interest is the analysis of both laboratory and astronomical molecular spectra using ML methods. The speed and quality of data gathered today, both in the laboratory and by astronomical instruments (e.g. ALMA) demand new and faster methods of both generating molecular line catalog entries and of analyzing astronomical data cubes. For the latter, there have been ongoing efforts in the framework of the German ALMA ARC (funded through ErUM Pro), but this funding line is not really suitable, and a new effort to make use of efficient ML methods is necessary. This could of course be extended to other wavelength ranges, since the principles are similar.

      Speaker: Prof. Peter Schilke (University of Cologne)
    • 10:10 10:15
      Ray-tracing and wave propagation 5m

      Ray-tracing algorithms are used in various contexts. This can be the simulation of light guides in a Cherenkov telescope, light propagation in ice or water, or complex optical systems as in gravitational wave detectors. Ray-tracing (and shading) is also the fundamental method for many industrial 3D applications, ranging from CAD design and virtual reality to game development. Beyond step-wise ray-tracing (as in MC simulations for photon propagation), mostly expensive commercial tools with little flexibility are available, which are thus difficult to include in a typical scientific workflow. On the other hand, many existing solutions lack the performance required to simulate applications with numerous elements. For complex physics simulations (e.g. interference, diffraction, resonators), no generalized scientific application for a broad community exists. AI algorithms promise to solve these problems with much higher performance than classical methods and to become easily scalable. To reduce computing time, a major challenge is the early decision which rays should be tracked further and which rays should be discarded. This is an ideal application for machine learning algorithms.
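
      A minimal sketch of the early ray-discard decision as a learned classifier (random ray parameters and a toy acceptance criterion stand in for a real, expensive tracer):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
rays = rng.uniform(-1, 1, size=(20000, 6))       # origin (x, y, z) + direction (dx, dy, dz)

# Toy "ground truth" from a full, expensive trace: does the ray reach the detector?
reaches_detector = (np.abs(rays[:, 3]) < 0.3) & (rays[:, 2] > 0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(rays[:10000], reaches_detector[:10000])  # train on rays that were traced in full

# At run time: keep only rays with a non-negligible chance of contributing.
p_keep = clf.predict_proba(rays[10000:])[:, 1]
survivors = rays[10000:][p_keep > 0.1]
print(f"kept {len(survivors)} of 10000 candidate rays")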

      Speaker: Thomas Bretz (RWTH Aachen University)
    • 10:15 10:20
      Adaptive simulation for components of a synchrotron 5m

      Active Experimentation
      Adaptive Simulations on X-ray diffraction data (detect distribution noise & distribution of data)

      Speaker: David Meier (Helmholtz-Zentrum Berlin)
    • 10:20 10:25
      ML on FPGAs for real-time processing of detector data 5m
      • Calorimeter data at current LHC experiments (e.g. ATLAS) require real-time energy reconstruction
      • Signal pile-up (in-time and out-of-time) is a challenge
      • ML approaches, like artificial neural networks (ANN), look promising
      • ANN training/application needs to be resource efficient for FPGA implementation
      • ANN training/application needs to be aware of bit precision for FPGA implementation (see the quantization sketch below)
      • VHDL, HLS and general tools shall be used/further developed to achieve the goal
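
      A minimal illustration of the bit-precision point referenced above: a software sketch that emulates signed fixed-point quantization of weights and inputs (hypothetical bit widths; the actual FPGA implementation would use VHDL/HLS as stated):

import numpy as np

def quantize(x, total_bits=8, frac_bits=6):
    # Round to a signed fixed-point grid with the given bit widths and saturate.
    scale = 2.0 ** frac_bits
    max_val = (2.0 ** (total_bits - 1) - 1) / scale
    return np.clip(np.round(x * scale) / scale, -max_val - 1.0 / scale, max_val)

rng = np.random.default_rng(1)
weights = rng.normal(0.0, 0.5, size=(4, 8))      # one small layer of an ANN
inputs = rng.normal(0.0, 1.0, size=8)            # e.g. calorimeter samples

full_precision = weights @ inputs
fixed_point = quantize(weights) @ quantize(inputs)
print("max deviation from the float result:", np.max(np.abs(full_precision - fixed_point)))
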
      Speaker: Arno Straessner (IKTP, TU Dresden)
    • 10:25 10:30
      Rapid online data evaluation for X-ray experiments 5m

      With the advent of high-speed multi-megapixel detectors, previously unimaginable in situ time-resolved and high-throughput experiments using synchrotron and free-electron laser X-ray sources have become possible. Such detectors generate massive quantities of data far in excess of what human experimenters can manually interpret and analyse, rendering the old-school data analysis approach of ‘collect and take home to analyse later’ obsolete. Analysing data quickly enough to keep the experimenter in the loop, able to make agile decisions on how to proceed with the experiment at the pace at which experiments can be performed, is now a major challenge. We would like to exploit AI-based data analysis to guide decisions during the course of an experiment, as well as in subsequent data selection. The key is to deliver preliminary results to users at high speed and in a timely manner, enabling experimenters to focus on the best data and to analyse much larger data volumes, thereby increasing the scientific output of our large-scale facilities.

      Speaker: Anton Barty (FS-SC (Scientific computing))
    • 10:30 10:35
      Dealing with Systematic Uncertainties in Machine Learning 5m

      Data analysis in particle physics today is characterized by large datasets of independent “events”. ML methods are ubiquitous in all analysis steps from physics object selection to classification of events into signal and background categories. The parameters of the underlying physics models are extracted using maximum-likelihood methods, using detailed simulated (synthetic) datasets.

      With even larger datasets expected for the next decade, the accuracy of many physics results will be limited by systematic uncertainties. Known systematic uncertainties are initially estimated by auxiliary measurements or variations of simulation parameters. They are then translated into nuisance parameters of the physics model and further constrained in the fit to data. However, the modeling of data in simulation may be imperfect, possibly leading to unwanted biases in the physics result. On the other hand, differences between data and simulation may also be due to new-physics effects.
      Our group has explored several approaches to reduce and/or quantify the impact of systematic uncertainties on ML methods in the context of Higgs-boson studies at the CERN Large Hadron Collider. The studies, documented in several undergraduate theses, include adversarial networks, Bayesian neural networks, and domain adaptation.

      Speaker: Ulrich Husemann (Institute of Experimental Particle Physics - Karlsruhe Institute of Technology - KIT)
    • 10:35 10:40
      Reinforcement Learning for the Automation and Improvement of Large-Scale Facilities Operation 5m

      Reinforcement Learning (RL) is one of the central and most interesting topics in the current ML landscape. In particular, at the light source BESSY II, the development of Deep RL agents for performance improvement has been a field of interest since 2019 and has already been carried out in several use cases at the machine (booster current optimisation, injection efficiency, mitigation of harmonic orbit perturbations...) with promising results. In addition, interpretability methods applied to the trained agents are giving interesting insights into the learning process and the machine itself. In the context of an eventual ErUM proposal we would like to share experiences with further use cases at other facilities, address the natural challenges associated with RL loops (learning stability, hardware issues...) and expand our RL/ML toolset with additional methods.
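
      A minimal sketch of the RL idea as an epsilon-greedy bandit tuning a single, toy accelerator parameter (the real BESSY II use cases employ deep RL agents on the actual machine; the environment and reward here are invented stand-ins):

import random

class ToyCorrectorEnv:
    # Hidden optimal setting; reward is a noisy beam-quality proxy.
    def __init__(self, n_settings=11):
        self.n_settings = n_settings
        self.optimum = random.randrange(n_settings)

    def step(self, setting):
        return -abs(setting - self.optimum) / self.n_settings + random.gauss(0, 0.02)

env = ToyCorrectorEnv()
q = [0.0] * env.n_settings        # running reward estimate per setting
counts = [0] * env.n_settings

for trial in range(500):
    if random.random() < 0.1:                                   # explore
        setting = random.randrange(env.n_settings)
    else:                                                       # exploit the current best estimate
        setting = max(range(env.n_settings), key=lambda s: q[s])
    reward = env.step(setting)
    counts[setting] += 1
    q[setting] += (reward - q[setting]) / counts[setting]

print("learned best setting:", max(range(env.n_settings), key=lambda s: q[s]),
      "| true optimum:", env.optimum)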

      Speaker: Luis Vera Ramirez (Helmholtz-Zentrum Berlin)
    • 10:40 10:45
      Information field theory 5m

      Information field theory (IFT) is a Bayesian framework for signal reconstruction and analysis, which builds on the mathematics of statistical field theory and machine learning (ML) inference schemes. Its practical usage is supported by Numerical Information Field Theory (NIFTy, available via git), a differentiable probabilistic programming library in Python. NIFTy permits the implementation of signal inference methods via generative models for the data. The fields, which are to be inferred, can live over multi-dimensional Euclidean spaces, the sphere, or even product spaces thereof. IFT and NIFTy have already been applied to a number of astronomical and astroparticle instruments and are suitable for many instruments of the ErUM-Data call. Here, we propose

      1. to interface a number of ErUM-Data instruments to NIFTy by implementing digital twins of them, in order to permit IFT signal reconstruction for their users,
      2. to advance the NIFTy algorithmics for faster performance, e.g. by parallelisation, GPU usage and other means,
      3. to interface NIFTy to other ML frameworks for interoperability, speed gain, and usage of deep neural networks as priors.

      Interested parties who would like to engage in any of these topics are welcome and encouraged to contact Torsten Enßlin <ensslin@mpa-garching.mpg.de>.
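
      A minimal numpy sketch of the simplest IFT reconstruction, a Wiener filter for the linear, Gaussian special case (NIFTy itself provides the general generative-model machinery described above; the prior covariance and instrument response used here are toy assumptions):

import numpy as np

rng = np.random.default_rng(42)
npix = 100
x = np.arange(npix)

# Assumed prior: smooth Gaussian random field with a squared-exponential covariance S.
S = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 5.0) ** 2)
signal = rng.multivariate_normal(np.zeros(npix), S + 1e-8 * np.eye(npix))

# Response R: the instrument samples only every third pixel, with Gaussian noise N.
R = np.eye(npix)[::3]
noise_var = 0.05
N = noise_var * np.eye(R.shape[0])
data = R @ signal + rng.normal(0.0, np.sqrt(noise_var), size=R.shape[0])

# Wiener filter: posterior mean m = S R^T (R S R^T + N)^(-1) d
m = S @ R.T @ np.linalg.solve(R @ S @ R.T + N, data)
print("rms reconstruction error:", np.sqrt(np.mean((m - signal) ** 2)))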

      Speaker: Torsten Ensslin (MPA)
    • 10:45 10:50
      Real-time displaced vertex reconstruction on FPGAs 5m

      Feebly coupled long-lived mediators that decay back into visible particles and missing energy are one of the smoking-gun signatures of DM at colliders. The Belle II experiment in Japan is ideally suited to search for such events. We plan to develop real-time algorithms using AI to identify non-pointing vertices in the presence of very high beam-induced backgrounds.

      We plan to develop resource-efficient algorithms for high background scenarios aimed at online data taking. Our focus will be on establishing a methodology towards efficient, scalable, and optimized translation of AI-based models for popular hardware platforms used in this field such as FPGAs. By combining modern AI SW libraries (e.g. PyTorch) and modern hardware development flows such as High-Level Synthesis we aim to abstract from hardware details while keeping key figures of merit such as resource efficiency, latency and algorithm performance comparable to manual implementations. This will enable key advances towards easier maintenance, adaptability, faster exploration for algorithms to be used, and validation before and after integration. Based on the distinct characteristics present at modern trigger and detector data flows, such as at Belle II, we will develop a reference processing system showcasing the viability of our approach. Derived from this, guidelines and methods for technology transfer towards similar applications from physics and industry will be developed.

      Speaker: Torben Ferber (KIT (ETP))
    • 10:50 10:55
      Machine Learning based Charged Particle Tracking 5m

      One of the most challenging reconstruction processes of future detector systems in the field of nuclear, hadron, particle and accelerator physics is the real-time identification of the detector hits belonging to the same track of a charged particle in an environment of thousands of tracks and low-momentum particles. This task has become an integral part of the online data reconstruction process, therefore demanding very high speeds to reconstruct the tracks.
      Additionally, usually not only one detector system measures the flight path of a particle but several different ones, which have to be combined. Often these detector systems do not return three-dimensional space points like pixel detectors but different kinds of information, like TPCs with x-, y-position and drift time or straw tube detectors with isochrone rings. Furthermore, the event topology might be very different if not only tracks from a primary interaction point are taken into account but also tracks from displaced vertices, curling tracks, tracks with a kink, tracks inside varying magnetic fields or in high-radiation-length environments. For all these cases classical track finding algorithms have a hard time finding the tracks.
      As this is a classical pattern recognition problem, machine learning techniques like language models, LSTMs or graph neural networks should be the ideal tools to tackle these problems and improve both the reconstruction efficiency and the reconstruction speed.
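
      A minimal sketch of track finding as edge classification: a small network decides whether two hits on adjacent layers belong to the same track (toy straight-line tracks; real applications would use graph neural networks on full detector geometries):

import torch
import torch.nn as nn

def toy_event(n_tracks=5, n_layers=6):
    # Straight toy tracks crossing equidistant layers; returns hit coordinates and track ids.
    slopes = torch.rand(n_tracks) * 2 - 1
    offsets = torch.rand(n_tracks) * 2 - 1
    hits, track_id = [], []
    for layer in range(n_layers):
        for t in range(n_tracks):
            hits.append([float(layer), (offsets[t] + slopes[t] * layer).item()])
            track_id.append(t)
    return torch.tensor(hits), torch.tensor(track_id)

def edges(hits, track_id):
    # All hit pairs on adjacent layers, labelled 1 if they belong to the same track.
    feats, labels = [], []
    for i in range(len(hits)):
        for j in range(len(hits)):
            if hits[j, 0] == hits[i, 0] + 1:
                feats.append(torch.cat([hits[i], hits[j]]))
                labels.append(float(track_id[i] == track_id[j]))
    return torch.stack(feats), torch.tensor(labels)

net = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))
optimiser = torch.optim.Adam(net.parameters(), lr=1e-2)

for epoch in range(200):
    hits, track_id = toy_event()
    features, labels = edges(hits, track_id)
    loss = nn.functional.binary_cross_entropy_with_logits(net(features).squeeze(1), labels)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

print("final edge-classification loss:", loss.item())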

      Speaker: Tobias Stockmann (FZJ)
    • 10:55 11:00
      Next generation v[f]ast data monitoring and control 5m

      Today’s and tomorrow’s detector systems in the field of nuclear, hadron, particle, and accelerator physics are facing a growth in complexity due to the exploding number of sensor elements and the large amount and variety of information that is being generated. The next generation of experiments foresees reconstructing the complete event topologies in situ and extracting high-level (physics) information in an environment with unprecedented interaction rates and/or background sources. Considering the complexity of information and the limitations in scalability of existing data processing schemes, it is necessary to incorporate an intelligent monitoring system with automated sensor calibration to guarantee a stable running mode. Such a paradigm shift can be realized by applying, on the one hand, machine learning techniques optimized for anomaly detection and, on the other hand, a feedback system that couples back to the sensor parameters in the case of a detector-related problem or initiates further processing of events with intriguing topologies. We propose to form a consortium that evaluates various machine learning techniques, including (un)supervised and reinforcement methods, to tackle this challenge within the broad field of accelerator-driven applications. This research will be guided by taking into account the interpretability of the system to provide sufficient understanding of and confidence in its operation.
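
      A minimal sketch of the anomaly-detection ingredient: an autoencoder learns to reproduce normal sensor readings, and a large reconstruction error flags a potential detector problem (random data stand in for real monitoring streams; thresholds and the feedback to the sensor parameters are not shown):

import torch
import torch.nn as nn

n_channels = 50                                    # monitored sensor values per readout
autoencoder = nn.Sequential(
    nn.Linear(n_channels, 16), nn.ReLU(),          # encoder: compress to a bottleneck
    nn.Linear(16, n_channels),                     # decoder: reconstruct the input
)
optimiser = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)

normal_data = torch.randn(2048, n_channels)        # "healthy" running conditions
for epoch in range(100):
    reconstruction = autoencoder(normal_data)
    loss = nn.functional.mse_loss(reconstruction, normal_data)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

# Monitoring: score new readouts by their reconstruction error.
new_readout = torch.randn(1, n_channels) + 3.0     # simulated drifted/misbehaving channels
score = nn.functional.mse_loss(autoencoder(new_readout), new_readout)
print("anomaly score:", score.item(), "(a threshold would be derived from normal data)")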

      Speaker: Johan Messchendorp
    • 11:00 11:05
      Artificial neural networks and deep learning methods for use in storage ring experiments with highly charged ions 5m

      Heavy-ion storage rings offer a unique possibility to investigate atomic and nuclear properties of highly charged ions. The GSI/FAIR accelerator facility is home to several such storage rings for highly charged ion research, such as the currently active experimental storage ring (ESR) and CRYRING, and the planned high-energy storage ring (HESR) and collector ring (CR). In such experiments, usually several as well as different types of detectors are used. As a result, huge amounts of data are generated that need to be analyzed using complex methods, both during the experiment (online) and afterwards (offline). As examples one can mention characteristic spectral lines in atomic physics experiments or nuclear mass and lifetime measurement scenarios.

      A precise online analysis is of prominent significance due to the strict timing requirements imposed either by the physics (short-lived states) or by the machine. Decisions based on a quick but exact identification of the beam components during data taking can be used for manual or automated changes of the parameters of the experiment and the machine in order to optimize the focus on the regions of interest. The above procedures can greatly profit from deep learning and artificial intelligence (ANN/DNN) methods. During the offline analysis, the scope of the application of machine learning algorithms can be extended not only to finalize the results but also to examine farther-reaching correlations inside larger amounts of data.

      Speaker: Shahab Sanjari (Aachen University of Applied Sciences + GSI Darmstadt)
    • 11:05 11:10
      Speed improvements of MC Event Generators and Detector Simulations using Deep Neural Networks 5m

      I am a co-author of the DYTURBO event generator, which predicts the differential cross-sections of vector boson production at the LHC. In order to improve the speed during the phase-space integration process, I would like to develop/include DNN-based phase-space sampling algorithms.

      Furthermore, I am working on DNN-based unfolding techniques and would be interested in extending the unfolding aspect to folding, i.e. the full detector simulation.

      Speaker: Matthias Schott (Uni Mainz)
    • 11:10 11:15
      Joint inference of calibration and signal 5m

      Data collected by instruments are influenced by the detector/system response, which has to be calibrated in order to reconstruct the underlying signal. The aim of the group is to jointly infer the detector/system response and the physical signal. Examples of the signal and the corresponding detector response include: reconstruction of the CR energy, direction and composition based on air shower properties (response: atmospheric conditions, telescope efficiency) and identification of nuclei in mass spectra measured in heavy-ion storage rings. The detectors/system can be calibrated using the data itself or with the help of additional calibration measurements (e.g., using sensors). A number of methods will be explored to tackle the problem of joint signal and calibration inference. This includes fundamental concepts of information field theory (IFT) and quantum field theory (QFT), in particular in view of tackling parametric degeneracies, as well as the usage of deep neural networks. The different approaches will be compared and combined.

      Speakers: Dmitry Malyshev (ECAP), Torsten Ensslin (MPA), Shahab Sanjari (Aachen University of Applied Sciences + GSI Darmstadt), Philip Ruehl (University of Siegen), Björn Garbrecht (TUM)
    • 11:15 11:20
      AI-based reconstruction of spectra and images 5m

      The reconstruction of experimentally inaccessible quantities is a common challenge in many research areas. In case the conversion from the sought-after quantities to experimental observables is governed by stochastic processes, an unfolding/deconvolution is required to extract the quantities of interest. Detector effects generally further enhance the smearing in the experimental observables.

      A number of algorithms for solving inverse problems exist. Classical approaches, however, lose the entire information on individual events during the unfolding process. This is not the case if the binned version of an inverse problem is interpreted as a classification problem and accordingly solved via the application of classification algorithms (see: https://sfb876.tu-dortmund.de/deconvolution/index.html), which greatly enhances the robustness and interpretability of the algorithms.

      Although the application of machine learning-based unfolding algorithms like DSEA and DSEA+ is a success story, some research questions still have to be addressed. One of the most pressing questions concerns the enhancement of the algorithm's performance via the consideration of neighborhood relations between individual classes (the classifier is unaware of these). Furthermore, the extension to spectral reconstruction in multiple dimensions is of great interest.
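
      A minimal sketch of unfolding as a classification task in the spirit of DSEA: a classifier predicts the true bin from the observables, and the spectrum estimate is the sum of predicted bin probabilities over the observed events (toy smearing, a single iteration, no reweighting):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_bins = 5

# Simulation: a true bin and a smeared observable per event.
true_bins = rng.integers(0, n_bins, size=20000)
observable = true_bins + rng.normal(0.0, 0.8, size=true_bins.size)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(observable[:, None], true_bins)

# "Measured" sample drawn from a different underlying spectrum.
data_true = rng.choice(n_bins, size=5000, p=[0.4, 0.25, 0.15, 0.12, 0.08])
data_obs = data_true + rng.normal(0.0, 0.8, size=data_true.size)

unfolded = clf.predict_proba(data_obs[:, None]).sum(axis=0)   # expected events per true bin
print("unfolded spectrum:", unfolded / unfolded.sum())
print("true spectrum:    ", np.bincount(data_true, minlength=n_bins) / data_true.size)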

      If you are interested in this topic feel free to contact me (Tim Ruhe, tim.ruhe@tu-dortmund.de).

      Speaker: Tim Ruhe (TU Dortmund)
    • 11:20 11:25
      High-resolution X-ray imaging with a large field of view: how to reproducibly and reliably reduce big data without loss of information and manage experiments and metadata with the aid of ANN/ML 5m

      Our associated partner is establishing BM 18 at the ESRF in 2022, featuring a fan beam and a large-area detector (LAD), enabling energy-dispersive and time-dependent micro-tomography for research and industrial applications. Our contribution to BM 18 will include the installation of the LAD and the corresponding software for data acquisition and reconstruction. The setup allows a very large FOV with resolutions down to 25 µm; thus users will generate big data (remotely). Data acquisition rates of 5 GB/s are expected, requiring new IT infrastructure and new algorithms for data handling. Compression of 1:20 is favourable; machine learning for automated scanner optimization and adaptation is targeted. Adequate acquisition, handling and analysis software for big data with the aid of ANN and ML is required, to be guided by scientists.
      Fast data transfer and accelerated reconstruction by software are crucial. For automated, experience-based, autonomous user-guided experiments with efficient ML/ANN parameterization of experiments, tracking of important metadata as well as optimized artefact-correction and reconstruction algorithms are required. Quantification of quality criteria as well as a combination with simulation are desirable.

      Speakers: Dr Simon Zabler (Fraunhofer IIS), Astrid Hoelzing (Fraunhofer IIS)
    • 11:25 11:30
      Data Evaluation Group 5m

      Our newly established Data Evaluation Group at the research neutron source Heinz Maier-Leibnitz (FRM II) near Munich offers support for the processing and evaluation of experimental data collected at selected neutron and X-ray instruments. This service particularly focuses on supporting infrequent or new users, facilitating data analysis to obtain meaningful results ready for publication in a short time. This includes guidance on data reduction steps after the experiments, data evaluation and interpretation using common software packages, and support in publication writing.

      Another successful activity of our group is the launch of a series of method-based one-day educational workshops with the support of instrument scientists and software providers, each workshop offering insights into a different neutron technique and the related software needed for data analysis.

      With this concept, we aim for several goals such as training the new generation of neutron scientists, encouraging interdisciplinary partnerships between experienced scientists, and interactive sessions with software providers to obtain users’ feedback for further software development.

      Speaker: Dr Neelima Paul (TUM)