PCD Data Science Basics

Data Science Lecture: Ultrafast Machine Learning Inference in FPGAs at the LHC

by Thea Aarrestad (CERN)


With edge computing, real-time inference of deep neural networks (DNNs) on custom hardware has become increasingly relevant. Smartphone companies are incorporating Artificial Intelligence (AI) chips into their designs for on-device inference to improve user experience and tighten data security, and the autonomous vehicle industry is turning to application-specific integrated circuits (ASICs) to keep latency low.

While the typical acceptable latency for real-time inference in applications like those above is O(1) ms, other applications require sub-microsecond inference. For instance, high-frequency trading machine learning (ML) algorithms run on field-programmable gate arrays (FPGAs), highly accurate devices, to make decisions within nanoseconds. At the extreme end of the inference spectrum, combining both the low-latency requirement (as in high-frequency trading) and the limited-area requirement (as in smartphone applications), is the processing of data from proton-proton collisions at the Large Hadron Collider (LHC) at CERN. Here, latencies of O(1) microsecond are required and resources are strictly limited.
In this lecture I will discuss how ML in FPGAs can improve the event selection process in particle detectors at the LHC, discuss and demonstrate how to reduce the memory footprint of ML models using state-of-the-art techniques such as model pruning and quantization, and show how to design and deploy a fast deep neural network on an FPGA using the hls4ml library.
This event is part of a series of lectures and tutorials on data science topics hosted by the Platform for Challenges in Data Science in the excellence cluster "Quantum Universe" between DESY and Universität Hamburg. It is intended specifically for the PhD students in the cluster, but younger and more senior members are of course also welcome.
Organized by

Gregor Kasieczka, Matthias Schröder