All Dates/Times are Australian Eastern Standard Time (AEST)

Technical Program

Paper Detail

Paper ID D2-S5-T3.1
Paper Title Mutual Information of Neural Network Initialisations: Mean Field Approximations
Authors Jared Tanner, Giuseppe Ughi, University of Oxford, United Kingdom
Session D2-S5-T3: Neural Networks I
Chaired Session: Tuesday, 13 July, 23:20 - 23:40
Engagement Session: Tuesday, 13 July, 23:40 - 00:00
Abstract The ability to train randomly initialised deep neural networks is known to depend strongly on the variance of the weight matrices and biases as well as the choice of nonlinear activation. Here we complement the existing geometric analysis of this phenomenon with an information-theoretic alternative. Lower bounds are derived for the mutual information between an input and hidden layer outputs. Using a mean field analysis, we provide analytic lower bounds as functions of the network weight and bias variances and of the choice of nonlinear activation. These results show that initialisations known to be optimal from a training point of view are also superior from a mutual information perspective.
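
For readers unfamiliar with mean field analyses of random networks, the following is a minimal sketch of the variance recursion commonly used in that literature, q_next = sigma_w^2 * E[phi(sqrt(q) z)^2] + sigma_b^2 with z a standard Gaussian. It illustrates only how the weight variance, bias variance and activation enter such an analysis; it does not compute the mutual information bounds of the paper, and it is not taken from the paper itself. The names `variance_map`, `fixed_point`, `sigma_w`, `sigma_b` and `q` are assumptions chosen for illustration, as is the default choice of tanh.

```python
import numpy as np


def variance_map(q, sigma_w, sigma_b, phi=np.tanh, n_nodes=61):
    """One step of an assumed mean-field variance recursion.

    Returns sigma_w^2 * E[phi(sqrt(q) * z)^2] + sigma_b^2 with z ~ N(0, 1).
    The expectation is approximated by Gauss-Hermite quadrature.
    """
    # Nodes and weights for the weight function exp(-z^2 / 2).
    z, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    # Rescale so the weights integrate against the standard normal density.
    w = w / np.sqrt(2.0 * np.pi)
    second_moment = np.sum(w * phi(np.sqrt(q) * z) ** 2)
    return sigma_w ** 2 * second_moment + sigma_b ** 2


def fixed_point(sigma_w, sigma_b, q0=1.0, n_layers=200, phi=np.tanh):
    """Iterate the variance map through n_layers layers, starting from q0."""
    q = q0
    for _ in range(n_layers):
        q = variance_map(q, sigma_w, sigma_b, phi=phi)
    return q


if __name__ == "__main__":
    # Hypothetical settings: (weight std, bias std) pairs to compare.
    for sigma_w, sigma_b in [(0.5, 0.0), (1.0, 0.0), (1.5, 0.3)]:
        q_star = fixed_point(sigma_w, sigma_b)
        print(f"sigma_w={sigma_w}, sigma_b={sigma_b}: q after 200 layers = {q_star:.6f}")
```

In this sketch, a small weight variance with no bias drives the pre-activation variance towards zero with depth, while larger variances settle at a nonzero value, which is the kind of dependence on initialisation that the abstract refers to.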