2021-04

Do winning tickets exist before DNN training?

Summary

The lottery ticket hypothesis proposes that a dense network contains at least one sub-network, a winning ticket, that matches the accuracy of the original network when trained in isolation. Recent work shows that under SGD noise, several such tickets emerge. We build on these works and study how winning tickets derived from one fixed network differ structurally and functionally under varying levels of stochasticity. Structurally, we show that the Hamming distance between winning tickets' shapes follows the hypergeometric distribution. Functionally, our experiments show that different emerging winning tickets are not disguised variants of each other, but also differ in their classification outputs. Finally, the regime of stochasticity affects the resulting tickets: decreasing randomness during training decreases both their functional and structural distance.
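
To make the structural claim concrete, the following is a minimal sketch of a random baseline, not the procedure used in this work. It assumes each ticket's shape can be represented as a binary mask of length n with exactly k kept weights, and that two masks are drawn independently and uniformly at random. Under those assumptions the overlap of the two masks is hypergeometric, and the Hamming distance equals 2(k - overlap). All function names (`random_mask`, `hamming`, `overlap_pmf`, `expected_hamming`) are hypothetical.

```python
import math
import random


def random_mask(n, k, rng):
    """Binary mask of length n with exactly k ones (assumed ticket shape)."""
    kept = set(rng.sample(range(n), k))
    return [1 if i in kept else 0 for i in range(n)]


def hamming(a, b):
    """Number of positions where two equal-length masks differ."""
    return sum(x != y for x, y in zip(a, b))


def overlap_pmf(n, k, j):
    """P(overlap == j) for two independent uniform masks with k ones each.

    This is the hypergeometric probability of drawing j marked items
    in k draws from n items, k of which are marked.
    """
    return math.comb(k, j) * math.comb(n - k, k - j) / math.comb(n, k)


def expected_hamming(n, k):
    """Expected Hamming distance under the random baseline.

    E[overlap] = k * k / n, and Hamming = 2 * (k - overlap).
    """
    return 2 * (k - k * k / n)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducibility
    n, k = 1000, 200        # illustrative sizes, not taken from the paper
    distances = [
        hamming(random_mask(n, k, rng), random_mask(n, k, rng))
        for _ in range(500)
    ]
    print("empirical mean:", sum(distances) / len(distances))
    print("expected mean: ", expected_hamming(n, k))
```

Comparing the empirical distances between actual winning tickets against this baseline is one way to check whether their shapes are distributed like independent random masks of the same sparsity.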