Evolving Larger Convolutional Layer Kernel Sizes for a Settlement Detection Deep-Learner on Summit

SC19 Proceedings

Abstract: Deep-learner hyper-parameters, such as kernel sizes, batch sizes, and learning rates, can significantly influence the quality of trained models. The state of the art for finding optimal hyper-parameters generally uses a brute force, grid search approach, random search, or Bayesian-based optimization among other techniques. We applied an evolutionary algorithm to optimize kernel sizes for a convolutional neural network used to detect settlements in satellite imagery. Usually convolutional layer kernel sizes are small – typically one, three, or five – but we found that the system converged at, or near, kernel sizes of nine for the last convolutional layer, and that this occurred for multiple runs using two different datasets. Moreover, the larger kernel sizes had fewer false positives than the 3x3 kernel sizes found as optimal via a brute force uniform grid search. This suggests that this large kernel size may be leveraging patterns found in larger areal features in the source imagery, and that this may be generalized as possible guidance for similar remote sensing deep-learning tasks.

Back to Deep Learning on Supercomputers Archive Listing

Back to Full Workshop Archive Listing