Paper ID | D2-S5-T3.2
Paper Title | Self-Regularity of Output Weights for Overparameterized Two-Layer Neural Networks
Authors | David Gamarnik, Eren C. Kizildag (MIT, United States); Ilias Zadik (NYU, United States)
Session | D2-S5-T3: Neural Networks I
Chaired Session | Tuesday, 13 July, 23:20 - 23:40
Engagement Session | Tuesday, 13 July, 23:40 - 00:00
Abstract | We consider the problem of finding a two-layer neural network with sigmoid, rectified linear unit, or binary step activation functions that ``fits'' a training data set as accurately as possible, as quantified by the training error, and we study the following question: \emph{does a low training error guarantee that the norm of the output layer (outer norm) is itself small?} We address this question for the case of non-negative output weights. Using a simple covering number argument, we establish that, under quite mild distributional assumptions on the input/label pairs, any such network achieving a small training error on polynomially many data points necessarily has a well-controlled outer norm. Notably, our results (a) have good sample complexity, (b) are independent of the number of hidden units, (c) are oblivious to the training algorithm, and (d) require quite mild assumptions on the data (in particular, the input vector $X\in\mathbb{R}^d$ need not have independent coordinates). We then show how our bounds can be leveraged to yield generalization guarantees for such networks.
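For orientation, a minimal sketch of the setup suggested by the abstract; the exact parameterization, the choice of outer norm, and the squared loss below are illustrative assumptions rather than details taken from the paper. A two-layer network with $m$ hidden units, activation $\sigma$ (sigmoid, ReLU, or binary step), inner weights $w_j\in\mathbb{R}^d$, and non-negative output weights can be written as
\[
  \widehat{f}(x) \;=\; \sum_{j=1}^{m} a_j\,\sigma\!\bigl(\langle w_j, x\rangle\bigr), \qquad a_j \ge 0,
\]
with outer norm $\|a\|_1 = \sum_{j=1}^{m} a_j$ (the absolute values are unnecessary since the $a_j$ are non-negative), and training error on data $(X_i, Y_i)$, $i=1,\dots,n$, for instance
\[
  \widehat{\mathcal{L}}_n(\widehat{f}) \;=\; \frac{1}{n}\sum_{i=1}^{n}\bigl(\widehat{f}(X_i) - Y_i\bigr)^2 .
\]
In this notation, the question posed in the abstract is whether a small $\widehat{\mathcal{L}}_n(\widehat{f})$ on $n$ polynomially many samples forces $\|a\|_1$ to be well controlled, uniformly in the number of hidden units $m$.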