26–28 Apr 2022
Europe/Berlin timezone

Reliability of Artificial Neural Networks under Hardware Faults

Not scheduled
2h
CFEL

Poster CCU (Computational Core Unit) Poster session with buffet

Speaker

Fin Hendrik Bahnsen (Hamburg University of Technology)

Description

Some machine learning algorithms use statistical, gradient-based learning methods to solve problems in a data-driven way. These methods find correlations in the datasets they are presented with and can therefore also address problems that are difficult to solve with classical algorithms. Lately, artificial neural networks (ANNs) have become one of the most important machine learning tools in many application domains. Applications of ANNs range from complex control tasks to the analysis of multivariate data. However, applying ANNs to difficult problems comes at the cost of considerable computing power. It is therefore often necessary to use hardware acceleration with graphics processing units (GPUs) or field-programmable gate arrays (FPGAs) to obtain the inference results of an ANN fast enough. At the same time, the question of how reliable and reproducible the results of an ANN model are arises more and more frequently. If the application field is safety-critical, e.g., in a setup where humans work alongside automated machines, answering such questions is essential to avoid life-threatening situations.

A large number of research publications deal with these questions, but hardware-related aspects are very often ignored. Our work shows that it is not enough to focus only on the machine learning algorithm behind the ANN. Especially for state-of-the-art models that rely on hardware acceleration in production, it is indispensable to consider the hardware itself as a source of failure. We present our toolchain, which takes failures in the underlying hardware into account to determine reliability for specific applications. In addition, we show a complete analysis of a concrete model using our toolchain, including sources of failure that may lie in the hardware, and discuss terms such as robustness that are closely related to system reliability.
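
To make the idea of hardware as a source of failure concrete, the following is a minimal sketch of one common fault model: a single bit flip in a stored 32-bit floating-point weight. It is not the toolchain presented on the poster. The names flip_bit, neuron and fault_campaign, the toy single-neuron model, and the tolerance used to decide whether an output counts as failed are all assumptions made for illustration.

```python
import random
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its IEEE 754 float32 encoding inverted."""
    if not 0 <= bit < 32:
        raise ValueError("bit index must be in [0, 31]")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (encoded,) = struct.unpack("<I", struct.pack("<f", value))
    # Invert the selected bit and reinterpret the result as a float32 again.
    (faulty,) = struct.unpack("<f", struct.pack("<I", encoded ^ (1 << bit)))
    return faulty


def neuron(weights, inputs, bias=0.0):
    """Toy single neuron with ReLU activation (hypothetical stand-in for an ANN)."""
    return max(0.0, sum(w * x for w, x in zip(weights, inputs)) + bias)


def fault_campaign(weights, inputs, trials=1000, tolerance=1e-2, seed=0):
    """Estimate the fraction of random single bit flips in the weights
    that move the output by more than `tolerance` (an assumed threshold)."""
    rng = random.Random(seed)
    golden = neuron(weights, inputs)  # fault-free reference output
    failures = 0
    for _ in range(trials):
        idx = rng.randrange(len(weights))  # which weight is hit
        bit = rng.randrange(32)            # which bit of that weight is hit
        faulty_weights = list(weights)
        faulty_weights[idx] = flip_bit(weights[idx], bit)
        if abs(neuron(faulty_weights, inputs) - golden) > tolerance:
            failures += 1
    return failures / trials
```

Calling fault_campaign([0.5, -0.25, 1.0], [1.0, 2.0, 0.5]) returns the observed failure rate for this toy neuron. In such a sketch, flips in low mantissa bits are typically masked, while flips in the sign or exponent bits change the output drastically, which is one reason why a reliability estimate depends on the number format and the hardware that stores it.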

Primary authors

Fin Hendrik Bahnsen (Hamburg University of Technology)
Goerschwin Fey (TU Hamburg)
