Optimizing the performance of reactive molecular dynamics simulations for many-core architectures
HM Aktulga and C Knight and P Coffman and KA O'Hearn and TR Shan and W Jiang, INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 33, 304-321 (2019).
DOI: 10.1177/1094342017746221
Reactive molecular dynamics simulations are computationally demanding. Reaching spatial and temporal scales where interesting scientific phenomena can be observed requires efficient and scalable implementations on modern hardware. In this article, we focus on optimizing the performance of the widely used LAMMPS/ReaxC package for many-core architectures. As hybrid parallelism allows better leverage of the increasing on-node parallelism, we adopt thread parallelism in the construction of bonded and nonbonded lists and in the computation of complex ReaxFF interactions. To mitigate the I/O overheads due to large volumes of trajectory data produced and to save users the burden of post-processing, we also develop a novel in situ tool for molecular species analysis. We analyze the performance of the resulting ReaxC-OMP package on two different architectures: (i) Mira, an IBM Blue Gene/Q system and (ii) Cori-II, a Cray XC-40 sytem with Knights Landing processors. For Pentaerythritol tetranitrate (PETN) systems of sizes ranging from 32 thousand to 16.6 million particles, we observe speedups in the range of 1.5-4.5x. We observe sustained performance improvements for up to 262,144 cores (1,048,576 processes) of Mira and a weak scaling efficiency of 91.5% in large simulations containing 16.6 million particles. The in situ molecular species analysis tool incurs only insignificant overheads across various system sizes and runs configurations.
Return to Publications page