International Conference on Machine Learning Workshop (ICML-W)
HORST: Composing Optimizer Geometries for Sparse Transformer Training
International Conference on Machine Learning Workshop (ICML-W)
Continuous Sparsification via Minimizing Movement
International Conference on Machine Learning (ICML)
SparseOpt: Addressing Normalization-induced Gradient Skew in Sparse Training
International Conference on Learning Representations (ICLR)
International Conference on Learning Representations (ICLR)
Never Saddle Down for Reparameterized Steepest Descent as Mirror Flow
International Conference on Learning Representations (ICLR)
Hyperbolic Aware Minimization: Implicit Bias for Sparsity
Conference on Neural Information Processing Systems (NeurIPS)
Pay Attention to Small Weights
Conference on Neural Information Processing Systems (NeurIPS)
The Graphon Limit Hypothesis: Understanding Neural Network Pruning via Infinite Width Analysis
International Conference on Machine Learning (ICML)
Mirror, Mirror of the Flow: How Does Regularization Shape Implicit Bias?
Conference on Neural Information Processing Systems (NeurIPS)