Adaptive Load-Balancing for Force-Decomposition Based 3-Body Molecular Dynamics Simulations in A Heterogeneous Distributed Environment with Variable Number of Processors

JV Sumanth and DR Swanson and H Jiang, 2007 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS (ICPP), 196-205 (2007).

Molecular Dynamics (MD), a computationally intensive problem, is used by researchers in various fields. The computational parallelism inherent in this application can be exploited in parallel and distributed environments. However, in heterogeneous distributed environments such as the Grid, the available resources, namely the network and computational power, are continually changing with respect to every available node. To optimally utilize these dynamic resources, a scheduler should be able to continually adapt to the changes and suitably vary the number of interactions scheduled to each node. We propose one such scheduling algorithm in this paper. MD simulations based on the spatial-decomposition technique (for short-range potentials) that assume heterogeneous compute power and homogeneous links exist in the literature. To the best of our knowledge, this paper is the first to perform a block-level decomposition of the force matrix for three-body potentials in a distributed environment with heterogeneous compute power and heterogeneous network links while exploiting the symmetries that exist in a three-body force matrix. Our previous work [24] targeted MD simulations using the Atom-Decomposition Method (slice-level decomposition of the force matrix) in a heterogeneous environment. The proposed scheduling algorithm builds and continually updates a model of the distributed system, which it then uses to make decisions about how to optimally redistribute the load in the system at every time step of the MD simulation. The scheduling algorithm can additionally handle dynamic changes in the number of nodes available for computation at runtime. We implement our algorithm and evaluate its effectiveness by measuring the idle fraction, which is a measure of the idle time experienced by all compute clients at every time step. This idle fraction is a load-balance optimality measure that indicates how close the load balancing is to the theoretical optimum of 0%. We find that under most typical conditions, it is roughly 6%. We also determine potential enhancements to improve the idle fraction further.
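The two ideas in the abstract, an idle-fraction metric and a per-time-step redistribution of interactions, can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the exact definition of the idle fraction or the system model, so the sketch assumes that every client waits for the slowest client in a time step and that each node's speed is summarized by a single measured rate (interactions per second). Network costs, the block-level decomposition, and the three-body symmetries are ignored. The function names `idle_fraction` and `rebalance` are hypothetical.

```python
from typing import List


def idle_fraction(busy_times: List[float]) -> float:
    """Fraction of total client time spent idle in one time step.

    Assumption: the step ends when the slowest client finishes, so each
    client idles for (slowest time - its own busy time).
    """
    step_time = max(busy_times)
    if step_time == 0:
        return 0.0
    idle = sum(step_time - t for t in busy_times)
    return idle / (step_time * len(busy_times))


def rebalance(total_interactions: int, rates: List[float]) -> List[int]:
    """Split interactions among nodes in proportion to their measured rates.

    Cumulative rounding keeps every share a non-negative integer and makes
    the shares sum exactly to total_interactions.
    """
    total_rate = sum(rates)
    shares = []
    cumulative = 0.0
    previous_edge = 0
    for rate in rates:
        cumulative += rate
        edge = round(total_interactions * cumulative / total_rate)
        shares.append(edge - previous_edge)
        previous_edge = edge
    return shares


if __name__ == "__main__":
    # Three clients: two finish in 1 s, one in 2 s, so a third of the
    # available client time is idle under the stated assumption.
    print(idle_fraction([1.0, 1.0, 2.0]))
    # Node 2 is twice as fast as the others, so it receives half the work.
    print(rebalance(100, [1.0, 1.0, 2.0]))
```

Under these assumptions a perfectly balanced step gives an idle fraction of 0, which corresponds to the theoretical optimum mentioned in the abstract. Calling `rebalance` again with a longer or shorter `rates` list is one simple way to picture nodes joining or leaving at runtime.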
