Efficient generation of stable linear machine-learning force fields with uncertainty-aware active learning

V Briganti and A Lunghi, MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 4, 035005 (2023).

DOI: 10.1088/2632-2153/ace418

Machine-learning (ML) force fields (FFs) enable an accurate and universal description of the potential energy surface of molecules and materials on the basis of a training set of ab initio data. However, large-scale applications of these methods depend on the ability to train accurate ML models with a small amount of ab initio data. In this respect, active-learning (AL) strategies, where the training set is generated by the model itself, combined with linear ML models are particularly promising. In this work, we explore an AL strategy based on linear regression that can predict the model's uncertainty for molecular configurations not sampled by the training set, thus providing a straightforward recipe for extending the latter. We apply this strategy to the spectral neighbor analysis potential and show that only tens of ab initio simulations of atomic forces are required to generate FFs for room-temperature molecular dynamics at or close to chemical accuracy, and whose stability can be systematically improved by the user at modest computational cost. Moreover, the method does not need any conformational pre-sampling, thus requiring minimal user intervention and parametrization.
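The uncertainty-driven selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the textbook ordinary least-squares prediction variance, sigma^2 (1 + x^T (X^T X)^-1 x), as the uncertainty measure, uses random vectors as stand-ins for descriptors such as bispectrum components, and fits a scalar target rather than atomic forces. The function names, the `ridge` regularizer, and the threshold are all hypothetical.

```python
import numpy as np


def fit_linear(X, y, ridge=1e-8):
    """Least-squares fit. Returns weights, noise variance, and (X^T X + ridge*I)^-1."""
    n, p = X.shape
    # Small ridge term (an assumption of this sketch) keeps the matrix invertible.
    A_inv = np.linalg.inv(X.T @ X + ridge * np.eye(p))
    w = A_inv @ X.T @ y
    resid = y - X @ w
    sigma2 = float(resid @ resid) / max(n - p, 1)
    return w, sigma2, A_inv


def predictive_std(x_new, sigma2, A_inv):
    """Textbook prediction standard deviation for each row of x_new."""
    x_new = np.atleast_2d(x_new)
    # Leverage x^T A_inv x grows as x moves away from the training data.
    leverage = np.einsum("ij,jk,ik->i", x_new, A_inv, x_new)
    return np.sqrt(sigma2 * (1.0 + leverage))


def select_for_labeling(candidates, sigma2, A_inv, threshold):
    """Indices of candidates whose predicted uncertainty exceeds the threshold."""
    return np.flatnonzero(predictive_std(candidates, sigma2, A_inv) > threshold)


# Synthetic stand-in data: 40 training "configurations" with 3 descriptors each.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 3))
w_true = np.array([1.0, -2.0, 0.5])
y_train = X_train @ w_true + 0.01 * rng.normal(size=40)

w, sigma2, A_inv = fit_linear(X_train, y_train)

# One candidate inside the sampled region, one far outside it.
candidates = np.vstack([np.zeros(3), np.full(3, 10.0)])
std = predictive_std(candidates, sigma2, A_inv)
threshold = 0.5 * (std[0] + std[1])
to_label = select_for_labeling(candidates, sigma2, A_inv, threshold)
print(std, to_label)
```

In an active-learning loop of this kind, the flagged configurations would be sent to an ab initio calculation, appended to the training set, and the model refit, so that the training set grows only where the model reports low confidence.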
