I am a Research Scientist at Google DeepMind.
Before that, I obtained my PhD at École Normale Supérieure (ENS Paris), working with Gabriel Peyré and Mathieu Blondel.
I graduated from École Polytechnique (X2016) and hold a master's degree from ENS Paris-Saclay in mathematics, vision and learning (MVA), as well as a master's degree from Sorbonne Université in mathematics (Modelling).
Contact: michael (dot) sander (at) polytechnique (dot) org
Publications
Michael E. Sander, Gabriel Peyré. Towards Understanding the Universality of Transformers for Next-Token Prediction. Preprint.
Michael E. Sander. Deeper Learning: Residual Networks, Neural Differential Equations and Transformers, in Theory and Action. PhD Manuscript.
Michael E. Sander, Raja Giryes, Taiji Suzuki, Mathieu Blondel, Gabriel Peyré. How do Transformers perform In-Context Autoregressive Learning? ICML, 2024. Paper, GitHub
Pierre Marion, Yu-Han Wu, Michael E. Sander, Gérard Biau. Implicit Regularization of Deep Residual Networks towards Neural ODEs. ICLR, 2024 (Spotlight). Paper, GitHub
Michael E. Sander, Joan Puigcerver, Josip Djolonga, Gabriel Peyré, Mathieu Blondel. Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective. ICML, 2023. Paper, GitHub
Michael E. Sander, Pierre Ablin, Gabriel Peyré. Do Residual Neural Networks discretize Neural Ordinary Differential Equations? NeurIPS, 2022. Paper, GitHub
Samy Jelassi, Michael E. Sander, Yuanzhi Li. Vision Transformers provably learn spatial structure. NeurIPS, 2022. Paper
Michael E. Sander, Pierre Ablin, Mathieu Blondel, Gabriel Peyré. Sinkformers: Transformers with Doubly Stochastic Attention. AISTATS, 2022. Paper, GitHub, short presentation
Michael E. Sander, Pierre Ablin, Mathieu Blondel, Gabriel Peyré. Momentum Residual Neural Networks. ICML, 2021. Paper, GitHub, short presentation