Batch Normalization and the impact of batch structure on the behavior of deep convolutional networks.

Mohamed Hajaj, Duncan Gillies

Batch normalization was introduced in 2015 to speed up the training of deep convolutional networks by normalizing the activations across the current batch to have zero mean and unit variance. The results presented here show an interesting aspect of batch normalization: controlling the structure of the training batches can influence what the network learns. If training batches are balanced (one image per class), and inference is also carried out on balanced test batches using each batch's own means and variances, then the conditional results improve considerably. The network exploits the strong information about easy images in a balanced batch and propagates it, through the shared means and variances, to help decide the identity of harder images in the same batch. Balancing the test batches requires the labels of the test images, which are not available in practice; however, further investigation can be done using batch structures that are less strict and might not require the test image labels. The conditional results show the error rate reduced almost to zero for nontrivial datasets with a small number of classes, such as CIFAR-10.
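The mechanism described above hinges on using the batch's own statistics at inference time, so that every image in the batch shares the same means and variances. A minimal NumPy sketch of that normalization step (an illustration under assumed shapes and names, not the authors' implementation) looks like this:

```python
import numpy as np

def batch_norm_with_batch_stats(x, gamma, beta, eps=1e-5):
    """Normalize activations with the current batch's own mean and variance.

    Because the statistics are shared across the whole batch, information
    about easy images in a balanced batch can influence the normalized
    activations of harder images in the same batch.
    """
    # x: (batch, features) activations; mean/variance per feature over the batch
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)   # zero mean, unit variance per feature
    return gamma * x_hat + beta             # learned scale and shift

# A balanced batch would contain one image per class; here random
# activations stand in for one feature map to show the normalization itself.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(10, 4))  # e.g. 10 classes, 4 features
y = batch_norm_with_batch_stats(x, gamma=np.ones(4), beta=np.zeros(4))
print(np.allclose(y.mean(axis=0), 0.0, atol=1e-6))  # → True
```

Standard batch normalization instead switches to running (population) statistics at test time, which makes each prediction independent of the rest of the batch; the balanced-batch setup deliberately keeps the per-batch statistics.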
