Paper ID | D5-S5-T3.1
Paper Title | Byzantine-Resilient SGD in High Dimensions on Heterogeneous Data
Authors | Deepesh Data, Suhas Diggavi (University of California, Los Angeles, United States)
Session | D5-S5-T3: Optimization
Chaired Session | Friday, 16 July, 23:20 - 23:40
Engagement Session | Friday, 16 July, 23:40 - 00:00
Abstract |
We study distributed stochastic gradient descent (SGD) in the master-worker architecture under Byzantine attacks. We consider the heterogeneous data model, where different workers may have different local datasets, and we do not make any probabilistic assumptions on data generation. At the core of our algorithm, we use the polynomial-time outlier-filtering procedure for robust mean estimation proposed by Steinhardt et al. (ITCS 2018) to filter out corrupt gradients. In order to apply their filtering procedure in our heterogeneous data setting, where workers compute stochastic gradients, we derive a new matrix concentration result, which may be of independent interest. We provide convergence analyses for smooth strongly convex and non-convex objectives and show that our convergence rates match those of vanilla SGD in the Byzantine-free setting. In order to bound the heterogeneity, we assume that the gradients at different workers have bounded deviation from each other, and we also provide concrete bounds on this deviation in the statistical heterogeneous data model.
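As a rough illustration of the pipeline the abstract describes, the following Python/NumPy sketch runs a master-worker SGD loop in which a fraction of workers report corrupted gradients and the master aggregates with a simplified soft spectral outlier filter (down-weighting points with large projections onto the top eigenvector of the weighted covariance), in the spirit of the Steinhardt et al. filtering idea. This is a toy approximation, not the paper's algorithm or analysis: the quadratic local losses, the sign-flip attack, the weighting rule, and all names (`local_stochastic_gradient`, `filtered_mean`, etc.) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, eps, T, lr = 20, 10, 0.2, 200, 0.05

# Illustrative heterogeneous local objectives f_i(x) = 0.5 * ||A_i x - b_i||^2
# (assumed for this sketch; not taken from the paper).
A = rng.normal(size=(n, d, d)) / np.sqrt(d)
b = rng.normal(size=(n, d))

def local_stochastic_gradient(i, x):
    """Worker i's stochastic gradient: exact local gradient plus noise."""
    return A[i].T @ (A[i] @ x - b[i]) + 0.1 * rng.normal(size=d)

def filtered_mean(G, rounds=10):
    """Soft spectral filter (sketch, not the exact ITCS 2018 routine):
    repeatedly down-weight points whose projections onto the top eigenvector
    of the weighted covariance are large, then return the weighted mean."""
    m = G.shape[0]
    w = np.ones(m) / m
    for _ in range(rounds):
        mu = w @ G                            # current weighted mean
        X = G - mu
        cov = X.T @ (X * w[:, None])          # weighted covariance (d x d)
        v = np.linalg.eigh(cov)[1][:, -1]     # top eigenvector
        tau = (X @ v) ** 2                    # per-worker outlier scores
        w *= 1.0 - tau / tau.max()            # soft down-weighting
        w /= w.sum()
    return w @ G

byz = rng.choice(n, size=int(eps * n), replace=False)  # Byzantine worker ids
x = np.zeros(d)
for _ in range(T):
    G = np.stack([local_stochastic_gradient(i, x) for i in range(n)])
    G[byz] = -10.0 * G[byz]        # Byzantine workers corrupt their reports
    x -= lr * filtered_mean(G)     # master filters, averages, and steps
```

With the filter in place, the iterates should track the benign average gradient even though 20% of the reports are adversarial; replacing `filtered_mean` with a plain average typically lets this attack dominate the update direction.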