2021 IEEE International Symposium on Information Theory

Technical Program

Paper ID	D2-S5-T3.2
Paper Title	Self-Regularity of Output Weights for Overparameterized Two-Layer Neural Networks
Authors	David Gamarnik, Eren C Kizildag, MIT, United States; Ilias Zadik, NYU, United States
Session	D2-S5-T3: Neural Networks I
Chaired Session	Tuesday, 13 July, 23:20 - 23:40
Engagement Session	Tuesday, 13 July, 23:40 - 00:00
Abstract	We consider the problem of finding a two-layer neural network with sigmoid, rectified linear unit, or binary step activation functions that ``fits'' a training data set as accurately as possible, as quantified by the training error, and study the following question: \emph{does a low training error guarantee that the norm of the output layer (outer norm) itself is small?} We address this question for the case of non-negative output weights. Using a simple covering number argument, we establish that, under quite mild distributional assumptions on the input/label pairs, any such network achieving a small training error on polynomially many data points necessarily has a well-controlled outer norm. Notably, our results (a) have a good sample complexity, (b) are independent of the number of hidden units, (c) are oblivious to the training algorithm, and (d) require quite mild assumptions on the data (in particular, the input vector $X\in\mathbb{R}^d$ need not have independent coordinates). We then show how our bounds can be leveraged to yield generalization guarantees for such networks.
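
The quantities named in the abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so everything concrete here is an assumption made for illustration: the network is written as a weighted sum of activations of inner products, the training error is taken to be the mean squared error, the outer norm is taken to be the l1 norm of the output weights, and the binary step is taken to be 1 at zero. All function names are hypothetical and do not come from the paper.

```python
import math

# Activation functions mentioned in the abstract.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def binary_step(z):
    # Assumed convention: the step equals 1 at z = 0.
    return 1.0 if z >= 0 else 0.0

def two_layer_output(x, hidden_weights, output_weights, activation):
    """Assumed model: sum_j a_j * activation(<w_j, x>) with every a_j >= 0."""
    if any(a < 0 for a in output_weights):
        raise ValueError("output weights must be non-negative")
    total = 0.0
    for w, a in zip(hidden_weights, output_weights):
        pre_activation = sum(wi * xi for wi, xi in zip(w, x))
        total += a * activation(pre_activation)
    return total

def training_error(inputs, labels, hidden_weights, output_weights, activation):
    """Assumed error measure: mean squared error over the training pairs."""
    squared = [
        (y - two_layer_output(x, hidden_weights, output_weights, activation)) ** 2
        for x, y in zip(inputs, labels)
    ]
    return sum(squared) / len(squared)

def outer_norm(output_weights):
    """Assumed outer norm: l1 norm of the output-layer weights."""
    return sum(abs(a) for a in output_weights)

# Toy data: two inputs in R^2, two hidden units, ReLU activation.
inputs = [[1.0, 0.0], [0.0, 1.0]]
labels = [2.0, 3.0]
hidden_weights = [[1.0, 0.0], [0.0, 1.0]]
output_weights = [2.0, 3.0]

err = training_error(inputs, labels, hidden_weights, output_weights, relu)
norm = outer_norm(output_weights)
print(err, norm)
```

In this toy instance the network fits both labels exactly, so the assumed training error is zero while the assumed outer norm is 5. The abstract's question is whether, for random data, a small value of the first quantity forces the second to stay controlled regardless of how many hidden units are used.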