SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Abstract: Facial recognition is a tractable problem today because of the prevalence of deep learning implementations. Approaches for creating structured datasets from unstructured web data are more easily accessible, as are the GPUs that deep learning frameworks use to learn from this data. In DARPA’s MEMEX effort, which sought to create better search capabilities for law enforcement to scan the deep and dark web, we were interested in leveraging the TensorFlow framework to reproduce a seminal deep learning facial recognition model called VGG-Face. On MEMEX, we wanted to build the VGG-Face model and train it for feature extraction, for use in prioritizing leads for possible law enforcement follow-up. We describe our efforts to recreate the VGG-Face dataset and to implement the deep learning network for it in TensorFlow. Though other implementations of VGG-Face on TensorFlow exist, none of them reproduces as much of the dataset as we do today (∼48% of the data still exists), nor do they provide detailed documentation for reproducing each step in the workflow. We contribute those instructions and use the Texas Advanced Computing Center’s Maverick2 supercomputer to perform the work. We report experimental results on building the dataset and on training the network, which achieves 77.99% validation accuracy on the 2,622-celebrity use case from VGG-Face. This paper can serve as a useful recipe for building new TensorFlow facial recognition models.





