Dr. Alexander Schell
Technical University of Munich
Advancing Sequential AI Models: New Mathematics to Bridge Stochastic Dynamics and Machine Learning
My research project focuses on the mathematical development and understanding of sequence-processing machine learning algorithms, referred to below as sequence models. These models analyse temporally ordered data, such as financial data, traffic data, weather patterns or speech signals, in order to make statistically sound predictions, based on past observations, about the future development and other relevant properties of these data series. Sequence models are an integral part of modern generative AI algorithms and are used intensively in application areas such as language processing, protein structure prediction and the modelling of dynamical systems.
The overarching goal of my project is to use methods from stochastic dynamics and rough path theory to analyse the so-called ‘latent structures’ of a broad class of sequence models, in order to better understand these models and improve how they function in a targeted way. Latent structures are internal, not directly visible and often geometrically encoded properties of the models; they are caused by the dynamics of the data and influence the models' behaviour and predictions. In particular, precise mathematical approaches and tools will be developed to make these models conceptually more manageable and interpretable and to quantify their capability and reliability in processing complex, uncertain data. The knowledge gained should contribute to a more refined mathematical characterisation of, and a measurable increase in, the accuracy and robustness of the investigated models, so that they remain consistent under realistic conditions and become more resistant to unexpected disturbances and model deviations. A particular focus is on the development of rigorously derived guarantees that describe and ensure the stability and efficiency of the models in practical scenarios.
In the long term, the project aims to develop a deeper mathematical understanding of sequence models by combining machine learning and stochastic dynamics, and to use the resulting precise quality guarantees to significantly broaden the potential applications of these models in various areas of artificial intelligence.