Neural Network Architecture Optimization through Submodularity and Supermodularity.

Authors
Junqi Jin, Ziang Yan, Kun Fu, Nan Jiang, Changshui Zhang

Deep learning models' architectures, including depth and width, are key factors influencing models' performance, such as test accuracy and computation time. This paper solves two problems: given a computation time budget, choose an architecture to maximize accuracy, and given an accuracy requirement, choose an architecture to minimize computation time. We convert this architecture optimization into a subset selection problem. With accuracy's submodularity and computation time's supermodularity, we propose efficient greedy optimization algorithms. The experiments demonstrate our algorithm's ability to find more accurate models or faster models. By analyzing architecture evolution with a growing time budget, we discuss relationships among accuracy, time and architecture, and give suggestions on neural network architecture design.
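
To make the subset selection framing concrete, here is a minimal sketch of a budget-constrained greedy loop. It is not the authors' algorithm: the abstract does not specify the selection rule, so the gain-per-cost ratio used here is an assumption, and the unit names, `GAIN`, `COST`, `PAIR_COST`, `accuracy`, and `time_cost` are all hypothetical toy stand-ins. The toy accuracy proxy is monotone submodular (diminishing returns) and the toy time proxy is supermodular (each added unit costs more as the set grows).

```python
# Hypothetical candidate architecture units with made-up numbers.
GAIN = {"a": 0.5, "b": 0.4, "c": 0.3, "d": 0.1}
COST = {"a": 2.0, "b": 1.0, "c": 1.0, "d": 0.5}
PAIR_COST = 0.25  # assumed extra time for every pair of selected units


def accuracy(subset):
    """Toy submodular proxy: 1 - prod(1 - gain) over the selected units."""
    miss = 1.0
    for item in subset:
        miss *= 1.0 - GAIN[item]
    return 1.0 - miss


def time_cost(subset):
    """Toy supermodular proxy: additive cost plus a pairwise interaction term."""
    n = len(subset)
    return sum(COST[i] for i in subset) + PAIR_COST * n * (n - 1) / 2


def greedy_under_budget(items, budget):
    """Repeatedly add the unit with the best marginal gain per marginal time."""
    selected = set()
    while True:
        best, best_ratio = None, 0.0
        for item in items - selected:
            trial = selected | {item}
            if time_cost(trial) > budget:
                continue  # adding this unit would exceed the time budget
            gain = accuracy(trial) - accuracy(selected)
            extra = time_cost(trial) - time_cost(selected)  # always > 0 here
            if gain / extra > best_ratio:
                best, best_ratio = item, gain / extra
        if best is None:
            return selected  # nothing affordable improves accuracy
        selected.add(best)


chosen = greedy_under_budget(set(GAIN), budget=3.0)
```

The mirrored problem in the abstract (meet an accuracy requirement at minimum time) could be sketched with the same loop by stopping once `accuracy(selected)` reaches the target instead of checking a budget.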
