PUNCH4NFDI Annual Meeting 2023

Timezone: Europe/Berlin
Location: Schellingstr. 4, Munich
Description

The 2023 in-person Annual Meeting of PUNCH4NFDI will take place on 12–13 October at LMU Munich.

Registration is open. Please register by 5 October for on-site participation. For remote participation in the public sessions, registration is possible until 13 October.

Participation on the first day, Thursday 12 October, is for PUNCH4NFDI members only. On the second day, Friday 13 October, there will be public sessions where everybody interested in our work is cordially invited to participate and contribute.

The call for abstracts is closed.

Venue: the meeting takes place at the Faculty of Physics at LMU, Schellingstr. 4. It can easily be reached by bus or U-Bahn; get off at "Universität" or "Universität - München".

The pre-meeting dinner on Wednesday takes place at 19:30 at the San Benno restaurant, Loristraße 14, 80335 München.

You can find the talks that were uploaded to Sync&Share here.

Contact: PUNCH4NFDI
  • Thursday, 12 October
    • 08:30–09:00
      Welcome coffee 30m
    • 09:00–10:45
      TA reports (SCH4 - H030)
      Convener: Thomas Schörner (DESY)
    • 10:45–11:20
      Topical Discussion (SCH4 - H030)
    • 11:20–11:40
      Coffee break 20m
    • 11:40–12:30
      Topical Discussion: Workflows (SCH4 - H030)
      Conveners: Harry Enke (AIP), Thomas Kuhr (Belle II Experiment)
    • 12:30–14:00
      Lunch 1h 30m
    • 14:00–15:00
      Topical Discussion: Metadata across PUNCH4NFDI (SCH4 - H030)
      Convener: Susanne Pfalzner (Forschungszentrum Jülich)
    • 15:00–15:45
      TA meetings: TA5 (SCH4 - H537)
      Conveners: Andreas Redelbach (Frankfurt Institute for Advanced Studies), Michael Kramer (Max-Planck-Institut für Radioastronomie)
    • 15:00–15:45
      TA meetings: TA6 (SCH4 - H206)
      Conveners: Dr Kilian Schwarz (IT Scientific Computing), Stefan Wagner (LSW, ZAH, U HD)
    • 15:45–16:00
      Coffee break 15m
    • 16:00–16:45
      Subgroup / Board meetings: Women4PUNCH (SCH4 - H537)
      Convener: Christiane Schneide (DESY)
    • 16:45–18:00
      Subgroup / Board meetings: Joint Meeting of CB, SAB, IRB and UC (SCH4 - H206)
      Conveners: Andreas Haungs (Karlsruhe Institute of Technology - KIT), Thomas Schörner (DESY)
    • 19:00–20:30
      Dinner 1h 30m (Löwenbräukeller, Nymphenburger Straße 2, D-80335 München)
  • Friday, 13 October
    • 08:30–09:00
      Welcome coffee 30m
    • 09:00–10:00
      Subgroup / Board meetings: MB Meeting (SCH4 - H206)
      Convener: Christiane Schneide (DESY)
    • 09:00–10:00
      Subgroup / Board meetings: Young Scientists Meeting (SCH4 - H537)
    • 10:00–10:20
      Coffee break 20m
    • 10:20–12:00
      Public Session: Open presentations I (SCH4 - H030)

      • 10:20
        Status of Compute4PUNCH 25m

        The status of Compute4PUNCH, which federates heterogeneous compute resources for the PUNCH4NFDI community, will be presented. The latest developments regarding the manipulation and refreshing of Helmholtz AAI access tokens, which are required to access the Storage4PUNCH resources from the Compute4PUNCH worker nodes, will be highlighted.

        Speaker: Benoit Roland
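        For context, token refreshing of this kind is typically a plain OAuth2/OIDC refresh-token grant. The sketch below shows the pattern in Python; the endpoint URL and client ID are placeholders, as the real values come from the Helmholtz AAI discovery document and client registration, not from this talk.

        ```python
        # Hedged sketch of an OAuth2 refresh-token grant (RFC 6749, section 6).
        # TOKEN_ENDPOINT and CLIENT_ID are placeholders, not Compute4PUNCH values.
        import requests

        TOKEN_ENDPOINT = "https://aai.example.org/oauth2/token"  # placeholder
        CLIENT_ID = "compute4punch-demo"                         # placeholder

        def refresh_access_token(refresh_token: str) -> str:
            """Exchange a refresh token for a fresh access token."""
            resp = requests.post(
                TOKEN_ENDPOINT,
                data={
                    "grant_type": "refresh_token",
                    "refresh_token": refresh_token,
                    "client_id": CLIENT_ID,
                },
                timeout=30,
            )
            resp.raise_for_status()
            return resp.json()["access_token"]
        ```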
      • 10:45
        REANA as one central element of the PUNCH SDP 25m

        REANA is a connecting engine that enables workflows for scientific tasks and access to the required resources. Basic features and usage will be demonstrated.

        Speaker: Harry Enke (AIP)
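        For readers unfamiliar with REANA, a minimal serial workflow is declared in a reana.yaml file and submitted with the reana-client commands create, upload, and start. The file below is illustrative (the script and file names are invented), not taken from the talk.

        ```yaml
        # Illustrative reana.yaml for a minimal serial REANA workflow.
        inputs:
          files:
            - code/analysis.py
        workflow:
          type: serial
          specification:
            steps:
              - environment: 'python:3.11'
                commands:
                  - mkdir -p results && python code/analysis.py > results/out.txt
        outputs:
          files:
            - results/out.txt
        ```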
      • 11:10
        Reduction of MeerKAT interferometric data in PUNCH4NFDI 25m

        In this talk I will describe a use case for the PUNCH4NFDI infrastructure which involves synergy among all of its components: computing resources (provided by Compute4PUNCH), storage resources (provided by Storage4PUNCH), workflow management, metadata management for data products, and solutions for the reproducibility of scientific analyses. The identified use case is the reduction of MeerKAT interferometric data taken in the new “OTF” mode. Interferometric scanning or “on-the-fly” (OTF) imaging provides a dramatic improvement in data-acquisition efficiency by removing the settle-and-slew overhead and by enabling commensal observing for intensity mapping and interferometric imaging. This new observing mode is currently being tested at MeerKAT in the context of the MeerKLASS survey, which will target an area of 10,000 square degrees with 2,500 hours of observations. I will briefly describe our semi-automatic pipeline, developed for scalable interferometric OTF imaging. I will explain how the pipeline is organized as a sequence of steps (flagging, rotation of phase centers, imaging, co-adding, source extraction) and how these are implemented in an optimized way to deal with a massive amount of data. I will finally report on tests of the deployment of the pipeline to existing and future computing infrastructures, such as those provided by PUNCH4NFDI. Ultimately, our experience suggests that a wide range of astronomical data analysis and processing tasks could also be carried out on the new PUNCH4NFDI infrastructure.

        Speaker: Nicola Malavasi
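        To make the step sequence concrete, here is a schematic Python driver in the spirit of the pipeline described above; the step bodies are placeholders, not the actual MeerKLASS implementation.

        ```python
        # Schematic driver mirroring the step sequence of the OTF reduction
        # pipeline described in the abstract; the step bodies are placeholders.
        import logging

        logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")

        def flag(scan):
            logging.info("flagging %s", scan)
            return scan

        def rotate_phase_centers(scan):
            logging.info("rotating phase centers of %s", scan)
            return scan

        def image(scan):
            logging.info("imaging %s", scan)
            return f"{scan}.img"

        def coadd(images):
            logging.info("co-adding %d images", len(images))
            return "mosaic.img"

        def extract_sources(mosaic):
            logging.info("extracting sources from %s", mosaic)
            return ["source_1", "source_2"]

        PER_SCAN_STEPS = [flag, rotate_phase_centers, image]

        def reduce_scan(scan):
            """Run the per-scan steps in order; each step consumes the previous output."""
            out = scan
            for step in PER_SCAN_STEPS:
                out = step(out)
            return out

        if __name__ == "__main__":
            images = [reduce_scan(f"scan_{i:03d}.ms") for i in range(3)]
            print(extract_sources(coadd(images)))
        ```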
      • 11:35
        Learning from the present for the future: the Jülich LOFAR Long-term Archive 25m

        Forschungszentrum Jülich has hosted the German part of the LOFAR long-term archive since 2013. Currently about 20 petabytes (PB) of data are stored, with a growth rate of around 2 PB per year.
        Future radio telescopes are expected to have much higher data rates and will bring new challenges in processing and storing data.
        Here we briefly report on the current data management of the Jülich LOFAR Data Archive, including the ingestion, the storage system, the export to the long-term archive, and the request chain. We analysed the data access patterns over the last 10 years and give an estimate of the energy consumption of the process. Based on this analysis, we define requirements for future, even more extensive long-term data archives.

        Speaker: Holger Stiele
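        As an aside, the kind of access-pattern analysis mentioned above can be sketched in a few lines of pandas; the log-file name and columns below are invented for illustration.

        ```python
        # Hedged sketch: yearly access volume from a hypothetical archive log
        # with columns timestamp, dataset_id, bytes_read.
        import pandas as pd

        log = pd.read_csv("access_log.csv", parse_dates=["timestamp"])
        per_year = log.groupby(log["timestamp"].dt.year)["bytes_read"].sum() / 1e15
        print(per_year)  # petabytes read per year
        ```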
    • 12:00–13:30
      Lunch 1h 30m
    • 13:30–15:10
      Public Session: Open presentations II (SCH4 - H030)

      • 13:30
        ErUM-Data : Preparation of new calls for funding 25m

        This contribution is intended as a slot for an initial discussion of input from PUNCH to the strategy meeting of the ErUM Digitization Board with the BMBF, in preparation of the next round of project calls.
        A short introduction will be given to stimulate the discussion.

        Speaker: Andreas Haungs (KIT / PUNCH)
      • 13:55
        AMPEL: Scientific exploration in the era of high throughput astronomical observatories 25m

        The rapid development of detector technology, including detectors sensitive to gravitational waves and neutrinos, has brought us to the threshold of an era in which we will be able to observe transient events as they unfold throughout a large fraction of the Universe. Handling these data floods requires new systems for data processing and the consistent application of modern statistical methods.

        Here I will introduce AMPEL, an open-source development platform for real-time data analysis. Users develop and tune complex workflows in a local development environment; these workflows can then be uploaded to a computing center for large-scale live processing or shared for reproducibility, effectively introducing the "code-to-data" paradigm in astronomy.

        AMPEL is already a critical component of current real-time multi-messenger programs and will be one of the brokers for the LSST real-time alert stream. Results from the ELAsTiCC simulations show that the technology and photometric classification methods are now mature for these data rates.

        Speaker: Jakob Nordin (Humboldt-Universität zu Berlin)
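        To illustrate the "code-to-data" idea in the plainest terms, here is a toy alert filter in Python. This is not AMPEL's actual unit API; it only shows the pattern of a user-defined selection that a broker applies to every incoming alert.

        ```python
        # Toy alert filter illustrating a user-defined selection applied to a
        # stream; fields and thresholds are invented, not AMPEL's interface.
        from dataclasses import dataclass

        @dataclass
        class Alert:
            magnitude: float
            n_detections: int

        def accept(alert: Alert) -> bool:
            """Keep bright, repeatedly detected transients."""
            return alert.magnitude < 19.0 and alert.n_detections >= 2

        stream = [Alert(18.2, 3), Alert(21.0, 1), Alert(18.9, 2)]
        selected = [a for a in stream if accept(a)]
        print(len(selected))  # -> 2
        ```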
      • 14:20
        ML-based Pipeline for Pulsar Analysis (PPA) 25m

        The detection of radio signals originating from pulsars poses a formidable challenge, mainly due to the omnipresent terrestrial and extraterrestrial radio interference. In view of the upcoming Square Kilometre Array Observatory (SKAO) with its huge data streams, interfering signals have to be identified in real time, already during the data-acquisition phase. For this reason, we have developed the ML-based Pipeline for Pulsar Analysis (ML-PPA), with an overall architecture that addresses real-time and Big Data requirements.

        In the initial phase, we employ an innovative in-silico method to generate artificial radio pulsar signals. These synthetic signals serve as the foundation for training a UNet-based and a CNN-based neural network, specially crafted to perform precise segmentation of pulsar signals from the surrounding terrestrial radio interference. In a parallel effort, the Python package is being converted into an efficient C++ backend and containerized for better scalability. This segmentation process is critical, as it enables us to isolate and classify the signals of interest in a noisy environment.

        Following successful segmentation, we implement a comprehensive machine learning pipeline. This pipeline leverages a range of techniques to classify and categorize the segmented signals, providing valuable insights into the nature of the detected pulsar emissions. This classification step is instrumental in understanding the diversity of pulsar signals and their potential sources.

        Our proposed pipeline represents a significant advancement in the realm of pulsar signal analysis. By combining synthetic signal generation, neural network segmentation, and machine learning classification, we offer a solution for extracting and characterizing pulsar signals amidst challenging radio noise scenarios. This approach promises to enhance our ability to uncover and understand the pulsar phenomena along with other enigmatic astronomical signals, contributing to the broader field of astrophysical research.

        The talk gives an overview of the very first version (0.1) of the ML-PPA framework.

        Speaker: Tanumoy Saha (HTW Berlin)
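        For orientation, a minimal 1-D CNN segmenter of the general kind described above fits in a few lines of PyTorch; the architecture, sizes, and random data are illustrative, not ML-PPA's actual network.

        ```python
        # Minimal 1-D CNN that labels every time sample as noise (0) or
        # pulse (1); a toy stand-in for the segmentation networks above.
        import torch
        import torch.nn as nn

        class TinySegmenter(nn.Module):
            def __init__(self, channels=16):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv1d(1, channels, kernel_size=7, padding=3), nn.ReLU(),
                    nn.Conv1d(channels, channels, kernel_size=7, padding=3), nn.ReLU(),
                    nn.Conv1d(channels, 2, kernel_size=1),  # two classes per sample
                )

            def forward(self, x):           # x: (batch, 1, time)
                return self.net(x)          # logits: (batch, 2, time)

        x = torch.randn(4, 1, 1024)          # fake time-series slices
        labels = torch.randint(0, 2, (4, 1024))
        model = TinySegmenter()
        loss = nn.CrossEntropyLoss()(model(x), labels)
        loss.backward()
        print(float(loss))
        ```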
      • 14:45
        Metadata schema for PUNCH Data Portal 25m

        In order to provide PUNCH4NFDI users with digital research products (DRPs) through the PUNCH4NFDI Science Data Portal in a FAIR way, users must be able to discover DRPs (which is achieved by the use of descriptive metadata) and must also be given a sufficient amount of other information for decision-making (such as legal, structural, etc. metadata). Generic metadata standards, such as the DataCite Kernel or Dublin Core, are intended as a common denominator to describe any known digital or physical resource. These general-purpose schemas are well suited as a base level for field-specific metadata standards. However, extensions are required in order to make particular subject-oriented resources well discoverable. Thus, a multi-level metadata schema is essential to provide a decent user experience within a particular NFDI consortium. Proposals for particular extensions are built on work with available community use cases, which meet the needs of the Data Portal's end users. The design of the metadata schema is closely linked to PID management and the development of a metadata repository.

        The contribution will cover the current state of the subject and possible future works.

        Speaker: Victoria Tokareva (KIT)
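        To give a flavour of such a multi-level record, the sketch below combines DataCite-style kernel fields with a domain extension block; the "punch:extension" keys are invented for illustration and are not the consortium's adopted schema.

        ```python
        # Two-level metadata record: generic DataCite-style kernel fields plus
        # a hypothetical domain-specific extension for discoverability.
        import json

        record = {
            "identifier": {"identifier": "10.5072/example-doi", "identifierType": "DOI"},
            "creators": [{"name": "Doe, Jane"}],
            "titles": [{"title": "Example digital research product"}],
            "publisher": "PUNCH4NFDI",
            "publicationYear": "2023",
            "resourceType": {"resourceTypeGeneral": "Dataset"},
            # Invented extension block, not an adopted PUNCH4NFDI schema:
            "punch:extension": {
                "instrument": "MeerKAT",
                "dataLevel": "L2",
                "skyCoverage": {"ra_deg": 266.4, "dec_deg": -29.0, "radius_deg": 5.0},
            },
        }
        print(json.dumps(record, indent=2))
        ```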
    • 15:10–15:40
      Public Session: Poster session & Coffee (SCH4 - H030)
    • 15:40–16:55
      Public Session: Open presentations III (SCH4 - H030)

      • 15:40
        Reproducing H.E.S.S. dark matter limits with Gammapy, an open-source Python package for gamma-ray astronomy 25m

        The search for dark matter (DM) is a long-standing quest. Accelerator and underground experiments provide direct limits on DM. Using astronomical observations at TeV energies, it is possible to derive indirect limits on the annihilation cross section of DM particles. While original data, pipelines, and even derived data are typically not available for joint analyses, we illustrate the TA6 work on open data and open tools with one example: we show how to replicate the limits derived with H.E.S.S. (the Imaging Atmospheric Cherenkov Telescope observatory best located in the Southern Hemisphere to observe the center of the Milky Way, and thus to derive the most constraining limits on the WIMP annihilation cross section) with published data and Gammapy, an open-source Python package for gamma-ray astronomy built on NumPy, SciPy, and Astropy. The code for the analysis is hosted in the PUNCH TA6 WP4 GitLab repository, where we keep testing CI/CD pipelines for open-source PUNCH software. Moreover, several analyses of TeV gamma-ray sources can be performed with public H.E.S.S. data, part of which we have made accessible via the Virtual Observatory. This is part of TA6's effort to facilitate synergies and exchange between different physics domains that investigate the same subject with different data and analysis techniques.

        Speaker: Dr Alessandro Montanari (ZAH, Landessternwarte, Heidelberg University)
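        As a starting point for readers who want to try Gammapy on public H.E.S.S. data, the snippet below loads the public DL3-DR1 test dataset shipped with the Gammapy tutorials (Crab runs, not the dark-matter observations discussed in the talk):

        ```python
        # Minimal Gammapy (1.x) access to the public H.E.S.S. DL3-DR1 test data;
        # assumes the tutorial datasets are installed under $GAMMAPY_DATA.
        from gammapy.data import DataStore

        data_store = DataStore.from_dir("$GAMMAPY_DATA/hess-dl3-dr1")
        observations = data_store.get_observations([23523, 23526])  # public Crab runs
        for obs in observations:
            print(obs.obs_id, obs.observation_live_time_duration)
        ```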
      • 16:05
        A model-independent likelihood function for the Belle II $B^+\to K^{+} \nu \bar{\nu}$ analysis 25m

        Rare decays like $B^+ \to K^+ \nu \bar{\nu}$, searched for by the Belle II collaboration, are important in particle physics research as they offer a window into physics beyond the Standard Model. However, the experimental challenges induced by the two final-state neutrinos require assumptions about the kinematic distribution of this decay. Consequently, the results feature a model dependency arising both from Standard Model assumptions and from the description of the pertinent hadronic matrix element, making reinterpretation complicated without reanalysing the underlying data.

        In this work, we address this issue by deriving a model-independent likelihood function, parameterizing the theory space in terms of Wilson coefficients of the weak effective theory, and reweighting the signal template according to the predicted kinematic signal distribution.
        By extending the pyhf fitting software and interfacing it with the EOS software for flavor physics phenomenology, we can perform a runtime update of the theoretical model, enabling us to derive exclusion limits in the space of Wilson coefficients.

        Once public, the model-independent likelihood function will be a useful tool for the particle physics community to perform tests on existing theoretical models. Publishing such likelihoods is crucial for a full exploitation of experimental results.

        Speaker: Lorenz Ennio Gaertner (Belle II Experiment)
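        The following sketch illustrates the runtime-model-update idea with pyhf alone: the signal template is scaled by a weight standing in for the Wilson-coefficient-dependent prediction (which in this work comes from EOS), and the model is rebuilt and tested at each point; all yields are invented.

        ```python
        # Hedged sketch: rebuild a pyhf model from a reweighted signal template
        # and recompute the CLs exclusion at each theory point.
        import pyhf

        def signal_template(wc_scale):
            base = [5.0, 10.0, 3.0]              # nominal signal yields per bin
            return [wc_scale * s for s in base]  # stand-in for an EOS reweighting

        bkg, bkg_unc = [50.0, 60.0, 40.0], [7.0, 8.0, 6.0]
        data = [55.0, 65.0, 42.0]

        for scale in (0.5, 1.0, 2.0):
            model = pyhf.simplemodels.uncorrelated_background(
                signal=signal_template(scale), bkg=bkg, bkg_uncertainty=bkg_unc
            )
            cls_obs = pyhf.infer.hypotest(
                1.0, data + model.config.auxdata, model, test_stat="qtilde"
            )
            print(f"scale={scale}: CLs={float(cls_obs):.3f}")
        ```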
      • 16:30
        Towards measurement of the information content in radio telescope streams 25m

        Over the last decades of development in radio astronomy, data rates and data complexity have increased many times over. This is due to multiple factors, among them the shift of the main focus of radio astronomical research towards time-varying and transient signals. In addition, the technological development of a variety of ground- and space-based telecommunication systems contributes to an intensifying radio-interference background. Together, these factors make it impossible to record all signals during observations and process them afterwards: the amount of data exceeds all feasible storage volumes. Unavoidably, some signals or parts of them must be rejected by semi- or fully automated procedures and are lost. It seems reasonable to adopt ideas from information theory to overcome some issues of this data loss and to identify which signals should be recorded for successful observations.

        Our main goal is to develop new approaches and techniques that can quantify the "amount" of information in the data stream of a radio telescope and help identify how to record the least amount of data while keeping the most information. At the current stage of the project we are working in two directions. First, we are looking for mathematical connections between different methods used for signal detection. Second, we are investigating different ways to identify global and local features of a signal and to compute the information content of the data stream based on these features. In this contribution, I will give a general picture of signal detection and compression from a statistical point of view, and present approaches to computing the information content in this framework.

        Speaker: Dr Vladimir Lenok (Universität Bielefeld)
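        As one naive, concrete proxy for the "amount" of information, the sketch below estimates the empirical Shannon entropy of a quantised stream with NumPy; the signals are synthetic and the estimator is illustrative only.

        ```python
        # Empirical Shannon entropy (bits/sample) of a quantised 1-D signal,
        # as a toy proxy for the information content of a data stream.
        import numpy as np

        def entropy_bits(samples, n_levels=256):
            hist, _ = np.histogram(samples, bins=n_levels)
            p = hist / hist.sum()
            p = p[p > 0]
            return -np.sum(p * np.log2(p))

        rng = np.random.default_rng(0)
        noise = rng.normal(size=100_000)             # noise-like stream
        tone = np.sin(np.linspace(0, 200, 100_000))  # highly redundant stream
        print(entropy_bits(noise), entropy_bits(tone))
        ```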
    • 16:55–17:05
      Public Session: Farewell (SCH4 - H030)
      Convener: Thomas Schörner (DESY)