I am a Research Scientist at Google DeepMind.

Before that, I was a PhD student at École Normale Supérieure (ENS Paris), advised by Gabriel Peyré and Mathieu Blondel.

I graduated from École Polytechnique (X2016) and hold a master's degree from ENS Paris-Saclay in Mathematics, Vision and Learning (MVA), as well as a master's degree from Sorbonne Université in Mathematics (Modelling).

Contact: michael (dot) sander (at) polytechnique (dot) org

Publications

  • Michael E. Sander, Gabriel Peyré. Towards Understanding the Universality of Transformers for Next-Token Prediction. Preprint.

  • Michael E. Sander. Deeper Learning: Residual Networks, Neural Differential Equations and Transformers, in Theory and Action. PhD Manuscript.

  • Michael E. Sander, Raja Giryes, Taiji Suzuki, Mathieu Blondel, Gabriel Peyré. How do Transformers perform In-Context Autoregressive Learning? ICML, 2024. Paper, GitHub

  • Pierre Marion, Yu-Han Wu, Michael E. Sander, Gérard Biau. Implicit regularization of deep residual networks towards neural ODEs. ICLR, 2024 (Spotlight). Paper, GitHub

  • Michael E. Sander, Joan Puigcerver, Josip Djolonga, Gabriel Peyré, Mathieu Blondel. Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective. ICML, 2023. Paper, GitHub

  • Michael E. Sander, Pierre Ablin, Gabriel Peyré. Do Residual Neural Networks discretize Neural Ordinary Differential Equations? NeurIPS, 2022. Paper, GitHub

  • Samy Jelassi, Michael E. Sander, Yuanzhi Li. Vision Transformers provably learn spatial structure. NeurIPS, 2022. Paper

  • Michael E. Sander, Pierre Ablin, Mathieu Blondel, Gabriel Peyré. Sinkformers: Transformers with Doubly Stochastic Attention. AISTATS, 2022. Paper, GitHub, short presentation

  • Michael E. Sander, Pierre Ablin, Mathieu Blondel, Gabriel Peyré. Momentum Residual Neural Networks. ICML, 2021. Paper, GitHub, short presentation