Presentation
Supercharging Digital Pathology AI with Unified Memory and HPC
Speaker
Event Type
HPC Impact Showcase
Time
Thursday, 21 November 2019, 2:30pm - 3pm
Location
503-504
Description
Application of AI in digital pathology has attracted increasing attention in recent years. As the field is in its nascent days, there are numerous methodological challenges to overcome. One of the main challenges is the extreme spatial dimension of digital whole slide images (WSIs), which are often larger than a billion pixels. Although compute accelerators such as GPUs significantly speed up deep neural network training, their rather limited memory cannot accommodate large image inputs such as WSIs. The most prevalent solution to this problem is the patch-based approach, which divides WSIs into small image inputs. However, this method has significant drawbacks, most notably the laborious process of performing detailed annotation to provide ground truth for individual patches. Our solution to this problem is to utilize CUDA Unified Memory to allow DNN training on entire WSI inputs. Leveraging the HPC power of TAIWANIA 2, along with our proposed memory optimization techniques, we are able to train a DNN on 1000 WSIs to reach maximum performance (0.99 AUROC) in 16 hours, a 550x speedup on a total of 128 GPUs. Our experience with the Nasopharyngeal Carcinoma detection case, in collaboration with Chang Gung Memorial Hospital, shows this brand-new training pipeline can significantly shorten the production cycle. By removing the need for 6 months of annotation effort, the method takes only 2 weeks to train a DNN model without accuracy loss. Our optimization, combined with HPC, results in an easily scalable solution that will greatly facilitate the development of digital pathology AI.
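To illustrate the scale problem the abstract describes, a quick back-of-the-envelope calculation (the slide dimensions and activation multiplier below are illustrative assumptions, not figures from the talk) shows why a gigapixel WSI cannot fit in typical GPU memory, motivating Unified Memory's on-demand paging between host RAM and the GPU:

```python
# Memory needed just to hold one WSI input tensor in float32.
# Assumed gigapixel slide of ~40,000 x 25,000 pixels (illustrative dimensions).
height, width, channels = 40_000, 25_000, 3
bytes_per_value = 4  # float32

input_bytes = height * width * channels * bytes_per_value
print(f"Input tensor alone: {input_bytes / 1e9:.0f} GB")  # 12 GB

# Training also stores intermediate activations for backpropagation; even a
# modest multiplier over the input size dwarfs a 16-32 GB GPU, which is why
# paging managed memory from host RAM (CUDA Unified Memory) becomes necessary.
activation_multiplier = 10  # conservative illustrative factor
print(f"With activations (~{activation_multiplier}x): "
      f"{input_bytes * activation_multiplier / 1e9:.0f} GB")
```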