Workshop: Deep Learning Enabled Unsupervised State Identification in KRAS Dimers and Interacting Lipids
Abstract: Mutations of the KRAS gene are a prevalent driver in nearly 30% of all human cancers. It is hypothesized that KRAS dimerization may facilitate RAF clustering, which is known to be required for RAF activation [1]. A recent study [2] demonstrated that KRAS dimerization is required to sustain the oncogenic function of mutant KRAS and revealed that disrupting dimerization could be a potential therapeutic strategy for KRAS-mutant cancers. Consequently, it is crucial to identify the stable states of KRAS dimers. Conformational states of KRAS monomers can be identified using hand-engineered features, such as tilt and rotation angles [3]. However, a similar approach is challenging for KRAS dimers because the complexity of the structure makes it nontrivial to identify the right set of angles for an adequate feature representation.
We present our ongoing efforts on state identification in KRAS dimers and associated lipids using ML-based models. In particular, with the advent of deep-learning-based approaches, it has become possible to use the raw coordinates of molecules to identify their states and the corresponding stability [4,5]. Our initial experiments support the hypothesis that state identification can be reduced to an unsupervised clustering problem in a latent space, which represents the underlying manifold governing the data distribution. We employ variational autoencoders built on deep neural networks [6] to encode the molecular coordinates of the KRAS dimer and the interacting lipid density fields into a meaningful latent space.
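As a rough illustration of the variational autoencoder objective mentioned above, the following minimal sketch computes the per-sample training loss (negative evidence lower bound) from an encoder's outputs. It is not the model used in this work: the function names, the squared-error reconstruction term (a Gaussian likelihood up to constants), and the `beta` weight are all assumptions chosen for illustration, and the encoder and decoder networks themselves are omitted.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) in each latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def squared_error(x, x_hat):
    """Reconstruction term (assumption: Gaussian likelihood, constants dropped)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def negative_elbo(x, x_hat, mu, log_var, beta=1.0):
    """Loss = reconstruction error + beta * KL regularizer (beta is hypothetical)."""
    return squared_error(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.5, -0.2], [0.0, -1.0]  # toy encoder outputs
    z = reparameterize(mu, log_var, rng)    # latent sample passed to a decoder
    print(len(z), kl_to_standard_normal(mu, log_var))
```

In a setup like this, the mean vector `mu` of each encoded configuration would typically serve as its latent coordinate for downstream clustering.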
Our training data comes from a massive simulation campaign using our MuMMI framework [7], which was run on up to 4000 nodes of the Sierra supercomputer at LLNL. Using data from 120,000 coarse-grained MD simulations, each over 1 microsecond long, we train the neural network on NVIDIA Tesla V100 GPUs to reduce the spatial coordinates to a low-dimensional latent space. We perform spectral clustering in the latent space to determine, in an unsupervised manner, the number of distinct clusters, each corresponding to a distinct state of the RAS dimer. The identified clusters are then analyzed further for their geometrical information and lifetime to understand their stability. Moreover, because this information is obtained for different lipid compositions, the lipid dependence of KRAS dimerization can also be tested.
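The abstract does not specify how the number of clusters is chosen or how lifetimes are measured, so the sketch below shows one common possibility under stated assumptions: the eigengap heuristic on a normalized graph Laplacian built from a Gaussian affinity between latent points, and lifetimes taken as lengths of contiguous runs of a cluster label along a trajectory. The function names, `sigma`, `max_k`, and `dt` are hypothetical, and the final assignment of points to clusters (for example k-means on the leading eigenvectors) is omitted.

```python
import numpy as np


def rbf_affinity(points, sigma=1.0):
    """Dense Gaussian affinity matrix between latent points (sigma is assumed)."""
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def estimate_num_clusters(affinity, max_k=10):
    """Eigengap heuristic: pick k at the largest gap among the smallest
    eigenvalues of the symmetric normalized graph Laplacian."""
    degree = affinity.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree)
    normalized = inv_sqrt[:, None] * affinity * inv_sqrt[None, :]
    laplacian = np.eye(len(affinity)) - normalized
    eigvals = np.linalg.eigvalsh(laplacian)  # returned in ascending order
    gaps = np.diff(eigvals[:max_k + 1])      # gaps[i] = lambda_{i+2} - lambda_{i+1}
    return int(np.argmax(gaps)) + 1


def dwell_times(labels, dt=1.0):
    """Map each label to the durations of its contiguous runs (dt per frame)."""
    runs = {}
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            runs.setdefault(labels[start], []).append((i - start) * dt)
            start = i
    return runs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
    latent = np.concatenate([c + 0.3 * rng.standard_normal((20, 2)) for c in centers])
    print(estimate_num_clusters(rbf_affinity(latent)))
    print(dwell_times([0, 0, 1, 1, 1, 0]))
```

A dense affinity matrix scales quadratically with the number of points, so a data set of the size described here would in practice need subsampling or a sparse nearest-neighbor graph.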
We believe that meaningful state identification in KRAS dimers and associated lipids using ML-based unsupervised methods can provide key insights for experiments facilitating therapeutic strategies. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-ABS-790038-DRAFT
[1] H. Lavoie and M. Therrien, Nat. Rev. Mol. Cell Biol., 16, 2015
[2] C. Ambrogio et al., Cell, 172, 2018
[3] T. Travers et al., Sci. Rep., 8, 2018
[4] F. Noe et al., Science, 365, 2019
[5] N. W. A. Gebauer, M. Gastegger, and K. T. Schütt, NIPS 2018
[6] C. Doersch, arXiv:1606.05908, 2016
[7] F. Di Natale et al., SC, 2019