Using a Supercomputer to Hunt Malware
Wednesday, 20 November 2019 1:30pm - 2pm
The Internet is a vast, continuously growing resource to which an arbitrarily large amount of new content is added every day. Over the past decade, cybercriminals have learned how to leverage the Internet for malicious purposes. Modern cybersecurity solutions need to quickly and accurately determine the security risk of visiting a website by classifying the complex information found within long sequences of data.

Effective machine learning solutions handle this problem by learning patterns from arbitrarily long input sequences in a way that captures signals correlated with malicious activity. These sequence modeling approaches are computationally expensive, requiring either large amounts of memory to individually keep track of a huge number of relevant subsequence fragments, or extensive computing power to iteratively evaluate and update networks that model the sequence as a whole.
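
To give a feel for why tracking subsequence fragments is memory-hungry, here is a minimal Python sketch that counts every contiguous character fragment up to a chosen length in a single input string. It is an illustration only, not the model described in this talk: the function name `ngram_counts`, the `max_n` parameter, and the sample URL are all hypothetical.

```python
from collections import Counter


def ngram_counts(sequence: str, max_n: int = 3) -> Counter:
    """Count every contiguous fragment of length 1..max_n in the sequence."""
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of width n across the sequence.
        for start in range(len(sequence) - n + 1):
            counts[sequence[start:start + n]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical input chosen for illustration; not data from the talk.
    url = "login.example.com/verify"
    for max_n in (1, 2, 4, 8):
        # The number of distinct fragments to store grows with max_n.
        print(max_n, len(ngram_counts(url, max_n)))
```

Even for one short string, the table of distinct fragments grows as the maximum fragment length increases. Across a large corpus of long sequences, that table is what drives the memory requirement.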

We’ve successfully used the resources at the San Diego Supercomputer Center (SDSC) to address these difficult computational challenges and have made significant advances in cybersecurity solutions. We will walk conference attendees through our process and share real-life examples. For instance, the unique resources at SDSC made it possible to train our Real-Time Anti-Phishing model, which requires rapid random access across nearly a terabyte of data. The high-memory computing resources allow us to relieve the random-access bottleneck and speed up the training process by several orders of magnitude, from days to minutes.
