Speaker
Giuseppe Bruno
Description
In this talk, we study the evolution of tokens across the depth of encoder-only transformer models at inference time, modeling them as a system of interacting particles in the infinite-depth limit. Motivated by techniques for extending the context length of large language models, we focus on the moderate interaction regime, where the number of tokens is large and the inverse temperature parameter scales accordingly. In this setting, the dynamics exhibit a multiscale structure. Using PDE analysis, we identify different phases depending on the choice of parameters.
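
For readers unfamiliar with the particle viewpoint, the following is a sketch of one formulation commonly used in the literature on self-attention dynamics. It is an assumption made for illustration and not necessarily the exact model of the talk. The sketch takes identity query, key, and value matrices, with tokens $x_1, \dots, x_n$ constrained to the unit sphere; the symbols $\beta$ (inverse temperature), $P_x^\perp$ (projection onto the tangent space at $x$), and $\mu_t$ (token distribution in the large-$n$ limit) are notation introduced here.

```latex
% Assumed illustrative model: tokens on the unit sphere S^{d-1},
% with identity query/key/value matrices.
\begin{align*}
  \dot{x}_i(t) &= P_{x_i(t)}^{\perp}\!\left(
      \frac{\sum_{j=1}^{n} e^{\beta \langle x_i(t),\, x_j(t) \rangle}\, x_j(t)}
           {\sum_{k=1}^{n} e^{\beta \langle x_i(t),\, x_k(t) \rangle}}
    \right),
  \qquad P_x^{\perp} y = y - \langle x, y \rangle\, x, \\[4pt]
  % Formal large-n description: continuity equation for the token distribution.
  \partial_t \mu_t &+ \operatorname{div}\!\big( \mu_t\, \mathcal{X}_\beta[\mu_t] \big) = 0,
  \qquad
  \mathcal{X}_\beta[\mu](x) = P_x^{\perp}\!\left(
      \frac{\int e^{\beta \langle x, y \rangle}\, y \, d\mu(y)}
           {\int e^{\beta \langle x, y \rangle}\, d\mu(y)}
    \right).
\end{align*}
```

In this notation, the moderate interaction regime corresponds to letting $\beta = \beta_n$ grow with the number of tokens $n$; the precise scaling is not stated in the abstract and is left unspecified here.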