Dr. Sebastian Stich has been a tenure-track faculty member at CISPA since December 1st, 2021, and a member of the European Lab for Learning and Intelligent Systems since June 2020. From December 1st, 2016 to November 30th, 2021, he worked as a research scientist at EPFL, hosted by Prof. Martin Jaggi in the Machine Learning and Optimization Laboratory (MLO). From November 1st, 2014 to October 31st, 2016, he worked with Prof. Yurii Nesterov and Prof. François Glineur at the Center for Operations Research and Econometrics (CORE) and the ICTEAM. From September 15th, 2010 to September 30th, 2014, he was a PhD student in Prof. Emo Welzl's research group, supervised by Prof. Bernd Gärtner and Christian Lorenz Müller. From September 2005 to March 2010, he completed his Bachelor's and Master's degrees in Mathematics at ETH Zurich.
International Conference on Machine Learning (ICML)
Enhancing LLM Training via Spectral Clipping
Conference on Learning Theory (COLT)
On the Stability of Nonlinear Dynamics in GD and SGD: Beyond Quadratic Potentials
International Conference on Artificial Intelligence and Statistics (AISTATS)
Accelerated Distributed Optimization with Compression and Error Feedback
International Conference on Learning Representations (ICLR)
Monotone Near-Zero-Sum Games: A Generalization of Convex-Concave Minimax
International Conference on Learning Representations (ICLR)
DADA: Dual Averaging with Distance Adaptation
International Conference on Learning Representations (ICLR)
Composite Optimization with Error Feedback: the Dual Averaging Approach
International Conference on Learning Representations (ICLR)
FedMuon: Federated Learning with Bias-corrected LMO-based Optimization
International Conference on Learning Representations (ICLR)
Non-Convex Federated Optimization under Cost-Aware Client Selection
Conference on Neural Information Processing Systems (NeurIPS)
Revisiting Consensus Error: A Fine-grained Analysis of Local SGD under Second-order Data Heterogeneity