Dr. Sebastian Stich has been a tenure-track faculty member at the CISPA Helmholtz Center for Information Security since December 1, 2021. He has been a member of the European Lab for Learning and Intelligent Systems since June 2020. From December 1, 2016 to November 30, 2021, he was a researcher at EPFL in the Machine Learning and Optimization Laboratory (MLO), headed by Prof. Martin Jaggi. From November 1, 2014 to October 31, 2016, he worked with Prof. Yurii Nesterov and Prof. François Glineur at the Center for Operations Research and Econometrics (CORE) and at ICTEAM. From September 15, 2010 to September 30, 2014, he was a PhD student in the research group of Prof. Emo Welzl, supervised by Prof. Bernd Gärtner and Christian Lorenz Müller. From September 2005 to March 2010, he completed his bachelor's and master's degrees in mathematics at ETH Zurich.
International Conference on Machine Learning (ICML)
Enhancing LLM Training via Spectral Clipping
Conference on Learning Theory (COLT)
On the Stability of Nonlinear Dynamics in GD and SGD: Beyond Quadratic Potentials
International Conference on Artificial Intelligence and Statistics (AISTATS)
Accelerated Distributed Optimization with Compression and Error Feedback
International Conference on Learning Representations (ICLR)
Composite Optimization with Error Feedback: the Dual Averaging Approach
International Conference on Learning Representations (ICLR)
FedMuon: Federated Learning with Bias-corrected LMO-based Optimization
International Conference on Learning Representations (ICLR)
Non-Convex Federated Optimization under Cost-Aware Client Selection
International Conference on Learning Representations (ICLR)
Monotone Near-Zero-Sum Games: A Generalization of Convex-Concave Minimax
International Conference on Learning Representations (ICLR)
DADA: Dual Averaging with Distance Adaptation
Conference on Neural Information Processing Systems (NeurIPS)
Revisiting Consensus Error: A Fine-grained Analysis of Local SGD under Second-order Data Heterogeneity