This workshop will focus on the transformer architecture and its underlying (self-)attention mechanisms, which have gained substantial interest in recent years. Despite their empirical success and groundbreaking advances in natural language processing, computer vision, and scientific computing, the mathematical understanding of transformers is still in its infancy, with many fundamental questions only beginning to be posed and addressed.
We aim to bring together researchers with backgrounds in multi-agent dynamics, optimal transport, and PDEs to initiate discussions on the theoretical principles governing transformers. By fostering this exchange, we seek to advance a young and rapidly evolving research field and to uncover new mathematical perspectives on transformer models.
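To give one illustration of this connection (a sketch in the spirit of recent work on transformer dynamics; the tokens $x_1, \dots, x_n$ and the generic parameter matrices $Q$, $K$, $V$ below are our notational assumptions, not tied to a specific model): stacked self-attention layers can be read, in a continuous-time idealization, as the interacting particle system
\[
\dot{x}_i(t) = \frac{1}{Z_i(t)} \sum_{j=1}^{n} e^{\langle Q x_i(t),\, K x_j(t) \rangle}\, V x_j(t),
\qquad
Z_i(t) = \sum_{j=1}^{n} e^{\langle Q x_i(t),\, K x_j(t) \rangle},
\]
where each token plays the role of an agent, the empirical measure of the tokens evolves under a nonlocal transport equation, and questions about clustering and long-time behaviour of representations become questions of multi-agent dynamics, optimal transport, and PDEs.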
Confirmed speakers
- Giuseppe Bruno (University of Bern)
- Valérie Castin (ENS Paris)
- Subhabrata Dutta (TU Darmstadt)
- Borjan Geshkovski (Inria Paris)
This is a satellite event of the Conference on Mathematics of Machine Learning 2025, which takes place at TUHH from September 22nd to 25th, 2025.
We gratefully acknowledge support from the DFG-funded priority programme Theoretical Foundations of Deep Learning and from Helmholtz Imaging.