Batch active learning for accelerating the development of interatomic potentials

N Wilson and D Willhelm and XN Qian and R Arroyave and XF Qian, COMPUTATIONAL MATERIALS SCIENCE, 208, 111330 (2022).

DOI: 10.1016/j.commatsci.2022.111330

Classical molecular dynamics (MD) has been widely used to study atomistic mechanisms and emergent behavior in materials at length and time scales beyond the capabilities of first-principles approaches. The success of classical MD simulations relies on the ability of classical interatomic potentials to accurately map complex many-body interacting systems of electrons and nuclei into effective few-body interacting systems of atoms. In practice, the development of interatomic potentials is a nontrivial process and requires a considerable amount of effort. Recently, machine learning has become a promising approach to accelerate interatomic potential development. However, these machine learning approaches are often computationally and data intensive, as they require a large amount of training data from first-principles calculations, such as total energies, atomic forces, and stress tensors of many atomistic structures. Here we propose an active learning approach combined with first-principles theory calculations to expedite the development of machine learning interatomic potentials. In particular, we develop a batch active learning method that combines energy uncertainty and structure similarity metrics to efficiently sample the highly uncertain structures that are difficult to predict. This active sampling approach maximizes the utility of the dataset in each batch and generates an interatomic potential with highly accurate and robust model coefficients, which are difficult to achieve with conventional sampling approaches. To demonstrate this batch active learning method, we develop an active learning potential for monolayer GeSe, a two-dimensional ferroelectric-ferroelastic material, and compare the quality and robustness of the active learning potential with those of the potential obtained from random sampling. The batch active learning method opens up avenues for accelerating the development of robust and accurate machine learning potentials using a small set of atomistic structures, which will be valuable for the computational materials, physics, and chemistry communities.
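
The idea of combining an energy uncertainty with a structure similarity metric when choosing a batch can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the uncertainty estimate, the structural descriptor, or the selection rule, so the function name select_batch, the use of a per-structure energy standard deviation, the Euclidean distance between descriptor vectors, and the min_distance threshold are all assumptions made for illustration. The sketch ranks candidate structures by uncertainty and greedily skips any candidate that lies too close to one already chosen.

```python
import numpy as np


def select_batch(energy_std, descriptors, batch_size, min_distance):
    """Greedy batch selection (illustrative assumption, not the paper's method).

    energy_std   : per-structure energy uncertainty, shape (n,)
    descriptors  : per-structure feature vectors, shape (n, d)
    batch_size   : maximum number of structures to return
    min_distance : hypothetical threshold below which two structures
                   are treated as too similar to share a batch
    """
    energy_std = np.asarray(energy_std, dtype=float)
    descriptors = np.asarray(descriptors, dtype=float)

    # Visit candidates from most to least uncertain.
    order = np.argsort(-energy_std, kind="stable")

    selected = []
    for idx in order:
        if len(selected) == batch_size:
            break
        if selected:
            # Distance from this candidate to every structure already chosen.
            dists = np.linalg.norm(descriptors[selected] - descriptors[idx], axis=1)
            if dists.min() < min_distance:
                continue  # too similar to a structure already in the batch
        selected.append(int(idx))
    return selected


# Toy data: structures 0 and 1 are near-duplicates with high uncertainty.
toy_std = [0.9, 0.8, 0.1, 0.7]
toy_desc = [[0.0, 0.0], [0.05, 0.0], [5.0, 5.0], [1.0, 1.0]]
batch = select_batch(toy_std, toy_desc, batch_size=2, min_distance=0.5)
```

In this toy case the second most uncertain structure is skipped because it nearly duplicates the first, so the batch spends its first-principles budget on two distinct structures rather than one structure twice. The selected structures would then be labeled with first-principles calculations and added to the training set before the potential is refit.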
