abstract:
Researchers in High Energy and Nuclear Physics (HENP), at CERN and worldwide, need to analyze petabytes of data efficiently. Because the number of particles produced in each collision is a priori unknown, HENP data does not have a canonical tabular or tensor layout; it is instead modeled by more complex hierarchical collections. The efficient storage and retention of, and access to, (subsets of) the recorded data are critical to the experiments’ physics programs.
This presentation discusses HENP’s columnar storage techniques, in particular the new RNTuple I/O system of the ROOT data analysis framework. RNTuple is a major upgrade of the established TTree I/O system and the result of a multi-year, open, and ongoing R&D effort. A first stable RNTuple on-disk format was released in November 2024. RNTuple responds to the unprecedented challenges to event data I/O, in terms of data rates, event sizes, and event complexity, posed by future experiments such as DUNE or the experiments at the High-Luminosity LHC. At the same time, RNTuple is designed to adapt to a changing I/O landscape, where traditional Grid storage and spinning disk pools are mixing with HPC cluster file systems and object stores, cloud storage, and NVMe disk cache layers in analysis facilities.
==============================================
Connection details:
ZOOM Meeting “PUNCHLunch seminar”:
https://desy.zoom.us/j/91916654877
Webinar ID: 919 1665 4877, passcode: 481572