This is the 2nd annual workshop organized in the framework of the FH platform on scientific computing (XWIKI FH-SCiComp). The main goals of the workshop are:
The workshop is primarily intended for early-career researchers who are actively pursuing this work. Of course, interested staff are also invited.
Topics:
Please register to help us organize the breaks.
Please fill out this survey about the gathering after the session on Thursday.
More details will follow soon. Ideally, subscribe to the platform mailing list fh-scicomp@desy.de (via the DESY mailing list server).
Asapo is a streaming framework developed and deployed at DESY to support data acquisition and online data analysis on processing clusters. It efficiently enables high-bandwidth communication between state-of-the-art detectors, storage systems, and independent analysis processes. The deployed solution provides real-time feedback and is offered as a service to DESY scientists. Several data-processing pipelines based on Asapo are deployed at different beamlines at the PETRA III facility.
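As a rough illustration of the pattern such a framework enables, and not of the actual ASAP::O API, the following toy sketch decouples data acquisition from online analysis using Python's standard queue module; all names here are hypothetical.

```python
import queue
import threading

# Toy stand-in for a detector stream: a producer pushes "events",
# an independent consumer pulls and analyses them as they arrive.
stream = queue.Queue(maxsize=100)

def acquire(n_events):
    """Pretend detector readout: push raw events into the stream."""
    for i in range(n_events):
        stream.put({"id": i, "pixels": [i % 7] * 16})
    stream.put(None)  # end-of-stream marker

def analyse():
    """Online analysis: consume events independently of acquisition."""
    while (event := stream.get()) is not None:
        total = sum(event["pixels"])
        print(f"event {event['id']}: integrated intensity {total}")

producer = threading.Thread(target=acquire, args=(5,))
consumer = threading.Thread(target=analyse)
producer.start(); consumer.start()
producer.join(); consumer.join()
```

In the deployed system the queue is replaced by the high-bandwidth streaming service, but the decoupling of acquisition and analysis processes is the same idea.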
Continuous Integration (CI) has become a key ingredient of scientific software development. GitHub is one of the main platforms currently in use for hosting open-source code. It offers a CI system called "GitHub Actions", which allows composable and reusable actions to be developed. In this contribution we show some of the possibilities GitHub Actions offers. Starting from the basics of building and testing changesets, we show how to speed up that process, and also how to use GitHub Actions to build full-fledged software images for the muon collider community.
The interTwin project, funded by Horizon Europe, is building an open-source Digital Twin Engine (DTE) to support interdisciplinary scientific Digital Twins (DTs) across domains such as High Energy Physics, Astrophysics, Climate Science, and Environmental Monitoring. The platform is co-designed by infrastructure providers, technology experts, and domain scientists to streamline the creation and deployment of complex DT workflows.
The DTE enables the execution of containerized workflows on heterogeneous computing backends (HPC, HTC, Cloud) through InterLink, a component that abstracts Kubernetes pod execution to remote resources using an extended Virtual Kubelet architecture. Complementing this, the interTwin Data Lake provides a federated data layer that enables efficient data access and management across multiple storage sites, supporting FAIR-aligned data practices to facilitate interoperability and reuse. It is built on technologies such as Rucio and FTS, and integrates identity and authorization via EGI Check-in.
To support site integration and user collaboration, the project introduces Teapot, a multi-tenant WebDAV interface developed within interTwin. Teapot enables storage sites without native WebDAV support to add WebDAV access seamlessly while preserving file ownership, making it fully compatible with HPC storage systems. This approach allows sites to securely expose storage to diverse user communities. Teapot also integrates with ALISE, which lets users link local accounts with multiple external identities, facilitating seamless access across federated environments.
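As a hedged illustration of what token-based WebDAV access to such a storage endpoint can look like on the client side (the endpoint URL and token handling below are placeholders, not the actual Teapot deployment), a minimal sketch using the Python requests library:

```python
import requests

# Hypothetical WebDAV endpoint exposed by a Teapot-like gateway and an
# OIDC access token obtained e.g. via EGI Check-in; both are placeholders.
BASE_URL = "https://storage.example.org:8443/webdav/mydata"
TOKEN = "eyJ...access-token..."
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# Upload a local file (WebDAV PUT)
with open("result.h5", "rb") as f:
    r = requests.put(f"{BASE_URL}/result.h5", data=f, headers=HEADERS)
    r.raise_for_status()

# List the collection (WebDAV PROPFIND, depth 1)
r = requests.request("PROPFIND", BASE_URL,
                     headers={**HEADERS, "Depth": "1"})
print(r.status_code, len(r.text), "bytes of XML metadata")

# Download the file again (plain GET)
r = requests.get(f"{BASE_URL}/result.h5", headers=HEADERS)
open("result_copy.h5", "wb").write(r.content)
```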
The data of the ZEUS experiment at HERA and their usage were converted to "preservation mode" in 2012, and new physics results have been published continuously from these data since then.
A description will be given of the original plan, its implementation, the latest status of data, software and knowledge access, some corresponding results, and the related challenges.
I will give a quick update on the FAIR and Open Data portal public-data.desy.de, outline how we employ metadata schemata for validated ingestion before community curation, and finally show that we can mint DOIs that resolve data locations directly.
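To illustrate what schema-validated ingestion can look like in practice (the schema and record below are invented examples, not the portal's actual metadata model), a minimal sketch with the Python jsonschema package:

```python
import jsonschema

# Hypothetical minimal metadata schema for a dataset record; real portal
# schemata are richer, this only shows the validation step at ingestion.
SCHEMA = {
    "type": "object",
    "required": ["title", "creators", "doi", "data_location"],
    "properties": {
        "title": {"type": "string", "minLength": 1},
        "creators": {"type": "array", "items": {"type": "string"}, "minItems": 1},
        "doi": {"type": "string", "pattern": r"^10\.\d{4,9}/\S+$"},
        "data_location": {"type": "string", "format": "uri"},
    },
}

record = {
    "title": "Example test-beam dataset",
    "creators": ["A. Scientist"],
    "doi": "10.12345/example-2024-001",
    "data_location": "https://public-data.desy.de/datasets/example-2024-001",
}

# Raises jsonschema.ValidationError if the record does not match the schema
jsonschema.validate(instance=record, schema=SCHEMA)
print("record accepted for curation")
```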
Data access is an essential component of the scientific process, as is the storage system where that data resides. There is no one-size-fits-all solution, which makes selecting the right storage system a non-trivial task.
In this presentation, I will give a brief overview of the file and storage systems used at DESY and offer guidelines for matching different workloads to appropriate storage solutions.
The increasing data flow for LHC Run 4 requires an architecture capable of handling massive parallelism. This shift demands a heterogeneous environment, creating the need for heterogeneous programming. Ecosystems such as Kokkos and alpaka provide portability across different accelerators, such as CPUs, GPUs, and FPGAs, which are the most commonly used at the CERN experiments. They achieve this by switching backends at runtime, making them well-suited for such diverse environments.
The CMS experiment has been integrating alpaka (an abstraction library for parallel kernel acceleration) into its data analysis software, CMSSW, for several years. The current goal is to migrate as many components as suitable from traditional serial CPU processing to GPU-based processing (both CUDA and ROCm) using alpaka.
One key component is the Phase 2 Outer Tracker data unpacker, which translates RAW data from the DAQ output to the reconstruction step. This component presents an excellent opportunity for increased parallelization and performance gains. The project presented here, the porting of the Phase 2 Outer Tracker unpacker from the CPU-based CMSSW model to an alpaka-based one, marks the first step toward building hardware-portable data analysis components for the HL-LHC.
We show some examples of computer algebra used in theoretical particle physics. The examples include analytic integration and summation, as well as solving systems of linear, differential, and difference equations, and are based on the computer algebra packages Mathematica, Maple, and FORM.
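As a small, hedged taste of the kind of task such packages handle, here is an equivalent toy computation with the open-source Python package SymPy (which is not one of the packages used in the talk): an analytic integral, a symbolic sum, and a simple differential equation.

```python
from sympy import (Eq, Function, binomial, dsolve, exp, integrate,
                   oo, summation, symbols)

x = symbols("x")
k, n = symbols("k n", integer=True, nonnegative=True)

# Analytic integration: integral of x^2 * exp(-x) from 0 to infinity
print(integrate(x**2 * exp(-x), (x, 0, oo)))          # 2

# Symbolic summation: sum over k of binomial(n, k)
print(summation(binomial(n, k), (k, 0, n)))           # 2**n

# A simple differential equation: f'(x) = f(x)
f = Function("f")
print(dsolve(Eq(f(x).diff(x), f(x)), f(x)))           # Eq(f(x), C1*exp(x))
```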
We present Pepper, a general-purpose framework for CMS data analysis developed at DESY. Pepper is a Python-based framework using columnar processing and the scikit-hep ecosystem of libraries, notably awkward and coffea. This approach offers processing speeds comparable to C++ frameworks while being more approachable for recent graduates with experience in numpy and similar tools. Pepper builds on its base packages by offering helper classes and functions, a simple configuration interface, working examples for simple analyses, and easily extensible code. The framework has been used in approximately 5 published analyses, with about 15 further analyses in progress, mostly at DESY. We will discuss lessons learnt from developing this framework, as well as challenges from an analysis and computing perspective.
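To give a flavour of the columnar style Pepper builds on (this is plain awkward-array usage, not Pepper's own interface), a minimal sketch: per-event jet selections are expressed as whole-array operations instead of explicit event loops.

```python
import awkward as ak

# Jagged array: jet transverse momenta (GeV) for three events
jets_pt = ak.Array([[55.2, 31.0], [], [120.5, 40.1, 22.3]])

# Columnar selection: keep jets with pT > 30 GeV, no Python loop needed
good_jets = jets_pt[jets_pt > 30.0]

n_good = ak.num(good_jets)            # [2, 0, 2]  jets per event
ht = ak.sum(good_jets, axis=1)        # [86.2, 0.0, 160.6]  scalar sum per event
event_mask = n_good >= 2              # [True, False, True]

print(ht[event_mask])                 # HT of events passing the selection
```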
The CMS High Granularity Calorimeter (HGCAL) will be an entirely new calorimeter for the high-luminosity phase of the LHC. It comprises hexagonal silicon modules and scintillating tile modules with silicon photomultipliers for readout, i.e., SiPM-on-tile modules. At DESY, about 2,000 modules are going to be assembled, which requires rigorous and automated quality control (QC) procedures. This contribution will show the design and workflow of the developed QC framework. Its modular approach allows the QC tests to be broken down into a series of steps, which involve interfacing with specialized hardware for data acquisition, studying the module's response when changing configuration parameters, performing on-the-fly data analysis, deriving calibration parameters, and automated reporting. The framework offers a one-click solution to enable even non-expert users to execute the QC procedures and determine if an assembled SiPM-on-tile module meets the standards required for integration into the CMS HGCAL.
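A minimal sketch of what such a modular, step-based QC pipeline can look like in Python; the step names and the Step interface are invented for illustration and are not the actual framework.

```python
from dataclasses import dataclass, field

@dataclass
class QCContext:
    """Shared state passed along the QC chain (module ID, step results)."""
    module_id: str
    results: dict = field(default_factory=dict)

class Step:
    """One self-contained QC step; subclasses implement run()."""
    name = "step"
    def run(self, ctx: QCContext) -> None:
        raise NotImplementedError

class Pedestal(Step):
    name = "pedestal"
    def run(self, ctx):
        # Placeholder for DAQ readout and on-the-fly analysis
        ctx.results[self.name] = {"mean_adc": 92.4, "ok": True}

class BiasScan(Step):
    name = "bias_scan"
    def run(self, ctx):
        # Placeholder for varying a configuration parameter and refitting
        ctx.results[self.name] = {"breakdown_voltage": 38.1, "ok": True}

def run_qc(module_id: str, steps: list[Step]) -> bool:
    ctx = QCContext(module_id)
    for step in steps:
        step.run(ctx)              # acquire, analyse, store per step
    passed = all(r["ok"] for r in ctx.results.values())
    print(f"{module_id}: {'PASS' if passed else 'FAIL'}", ctx.results)
    return passed

run_qc("sipm_tile_0001", [Pedestal(), BiasScan()])
```

The one-click behaviour described above corresponds to chaining all such steps behind a single entry point, with the pass/fail decision and report generated automatically at the end.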
Flavour tagging is a critical component of the ATLAS experiment's physics programme. Existing flavour-tagging algorithms rely on several low-level taggers, which are a combination of physically informed algorithms and machine learning models. A novel approach presented here instead uses a single machine learning model based on reconstructed tracks, avoiding the need for low-level taggers based on secondary vertexing algorithms. This new approach reduces complexity and improves tagging performance. The model employs a transformer architecture to process information from a variable number of tracks and other objects in the jet in order to simultaneously predict the jet's flavour, the partitioning of tracks into vertices, and the physical origin of each track. The inclusion of auxiliary tasks aids the model's interpretability. The new approach significantly improves jet flavour identification performance compared to existing methods in both Monte Carlo simulation and collision data. Notably, the versatility of the approach is demonstrated by its successful application to boosted Higgs tagging using large-R jets.
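A schematic PyTorch sketch of the general idea: a transformer over a padded, variable-length set of tracks with a per-jet flavour head and a per-track auxiliary head. The layer sizes and head definitions are illustrative assumptions, not the actual ATLAS model.

```python
import torch
import torch.nn as nn

class TrackTagger(nn.Module):
    def __init__(self, n_feat=16, d_model=64, n_flavours=3, n_origins=5):
        super().__init__()
        self.embed = nn.Linear(n_feat, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           dim_feedforward=128,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=3)
        self.jet_head = nn.Linear(d_model, n_flavours)    # e.g. b / c / light
        self.track_head = nn.Linear(d_model, n_origins)   # auxiliary task

    def forward(self, tracks, pad_mask):
        # tracks: (batch, max_tracks, n_feat); pad_mask: True where padded
        h = self.encoder(self.embed(tracks), src_key_padding_mask=pad_mask)
        # Masked mean-pool over real tracks for the per-jet prediction
        valid = (~pad_mask).unsqueeze(-1).float()
        pooled = (h * valid).sum(dim=1) / valid.sum(dim=1).clamp(min=1.0)
        return self.jet_head(pooled), self.track_head(h)

model = TrackTagger()
tracks = torch.randn(2, 10, 16)                    # 2 jets, up to 10 tracks
pad_mask = torch.zeros(2, 10, dtype=torch.bool)
pad_mask[0, 6:] = True                             # first jet has only 6 tracks
jet_logits, track_logits = model(tracks, pad_mask)
print(jet_logits.shape, track_logits.shape)        # (2, 3), (2, 10, 5)
```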
This contribution presents the final iteration of the CaloClouds series. Simulation of photon showers at the granularities expected in a future Higgs factory is computationally challenging. A viable simulation must capture the fine details exposed by such a detector, while also being fast enough to keep pace with the expected rate of observations. The CaloClouds model utilises point cloud diffusion and normalising flows to replicate Monte Carlo simulation with exceptional accuracy. First, we give a lightning overview of the model's objectives and constraints. To describe the upgrades in the latest version, we detail the studies on the flow model and the optimisations made, and then summarise the steps taken to generalise CaloClouds 3 for use in the whole detector. Considering some of the underlying principles of model design, we look at the impact of the data format choice on model outcomes. Finally, we compare reconstructions performed on CaloClouds 3 output with those from Geant4 simulation, demonstrating that the model provides reliable physics reproductions.
References
https://arxiv.org/pdf/2309.05704
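To illustrate the point-cloud diffusion idea behind CaloClouds in a very reduced form, the following sketch runs a generic, untrained DDPM-style reverse-diffusion loop over a toy point cloud; the network, noise schedule, and point features are placeholders and do not reflect the actual CaloClouds 3 architecture or sampler.

```python
import torch

class Denoiser(torch.nn.Module):
    """Toy stand-in for the learned point-cloud noise predictor."""
    def __init__(self, dim=4):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim + 1, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, dim))
    def forward(self, x, t):
        # x: (n_points, dim) noisy point cloud, t: scalar timestep in [0, 1]
        t_col = torch.full((x.shape[0], 1), float(t))
        return self.net(torch.cat([x, t_col], dim=1))  # predicted noise

# Linear beta schedule and DDPM-style ancestral sampling
T = 100
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

model = Denoiser()
x = torch.randn(200, 4)  # 200 points: (x, y, z, energy), starting from pure noise
for t in reversed(range(T)):
    eps = model(x, t / T)
    coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
    mean = (x - coef * eps) / torch.sqrt(alphas[t])
    noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
    x = mean + torch.sqrt(betas[t]) * noise
# With a trained denoiser, x would now approximate a generated shower point cloud
```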
The Helmholtz Model Zoo (HMZ) is a cloud-based platform that gives Helmholtz researchers seamless access to deep learning models via a web interface and REST API. It enables easy integration of AI into scientific workflows without moving data outside the Helmholtz Association.
Scientists across all 18 Helmholtz centers can contribute models through a streamlined GitLab submission process. HMZ automatically tests, deploys, and generates web and API interfaces based on submitted model metadata, supporting diverse use cases with minimal effort.
Inference runs on GPU nodes (4×NVIDIA L40) at DESY Hamburg via NVIDIA Triton Server, with data stored securely in HIFIS dCache InfiniteSpace and owned by the uploading user. Both open and restricted model sharing are supported.
HMZ is developed by Helmholtz Imaging at DESY, with support from HIFIS and Helmholtz AI.
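As a hedged sketch of what calling such a hosted model through a REST API can look like from a user's script (the endpoint path, payload fields, and token below are placeholders, not the documented HMZ API):

```python
import requests

# Placeholder endpoint and token; consult the HMZ documentation for the
# actual URL scheme, authentication, and per-model input format.
API_URL = "https://modelzoo.example.helmholtz.de/api/v1/models/segmenter/infer"
TOKEN = "hmz-access-token"

with open("sample_image.png", "rb") as f:
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {TOKEN}"},
        files={"input": ("sample_image.png", f, "image/png")},
        timeout=300,
    )
response.raise_for_status()
result = response.json()          # e.g. predictions or a link to the outputs
print(result)
```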