Molecular Latent Space Simulators for Distributed and Multimolecular Trajectories

MS Jones and ZA McDargh and RP Wiewiora and JA Izaguirre and HF Xu and AL Ferguson, JOURNAL OF PHYSICAL CHEMISTRY A, 127, 5470-5490 (2023).

DOI: 10.1021/acs.jpca.3c01362

All atom molecular dynamics (MD) simulations offer a powerful tool for molecular modeling, but the short time steps required for numerical stability of the integrator place many interesting molecular events out of reach of unbiased simulations. The popular and powerful Markov state modeling (MSM) approach can extend these time scales by stitching together multiple short discontinuous trajectories into a single long-time kinetic model but necessitates a configurational coarse-graining of the phase space that entails a loss of spatial and temporal resolution and an exponential increase in complexity for multimolecular systems. Latent space simulators (LSS) present an alternative formalism that employs a dynamical, as opposed to configurational, coarse graining comprising three back-to-back learning problems to (i) identify the molecular system's slowest dynamical processes, (ii) propagate the microscopic system dynamics within this slow subspace, and (iii) generatively reconstruct the trajectory of the system within the molecular phase space. A trained LSS model can generate temporally and spatially continuous synthetic molecular trajectories at orders of magnitude lower cost than MD to improve sampling of rare transition events and metastable states to reduce statistical uncertainties in thermodynamic and kinetic observables. In this work, we extend the LSS formalism to short discontinuous training trajectories generated by distributed computing and to multimolecular systems without incurring exponential scaling in computational cost. First, we develop a distributed LSS model over thousands of short simulations of a 264-residue proteolysis-targeting chimera (PROTAC) complex to generate ultralong continuous trajectories that identify metastable states and collective variables to inform PROTAC therapeutic design and optimization. Second, we develop a multimolecular LSS architecture to generate physically realistic ultralong trajectories of DNA oligomers that can undergo both duplex hybridization and hairpin folding. These trajectories retain thermodynamic and kinetic characteristics of the training data while providing increased precision of folding populations and time scales across simulation temperature and ion concentration.
