Jan Janssen¹, Danny Perez¹
Los Alamos National Laboratory¹
Classical molecular dynamics (MD) is in principle an ideal tool for investigating the long-time evolution of materials, as ab initio-based MD simulations remain limited to very short timescales. While modern machine learning MD potentials report errors on the order of 1 meV/atom, these errors are typical only of configurations similar to those in the training set used to fit the potential, and transferability to genuinely new configurations remains limited. This poses a challenge to the accuracy of long-time MD simulations for two reasons: i) transition rates are exponentially sensitive to energy barriers, and ii) saddle configurations form a very small subset of the whole configuration space and so are very unlikely to appear in traditional hand-crafted datasets, or even in conventional active-learning approaches based on MD.

We propose a large-scale automated workflow to develop and validate transferable machine learning potentials for long-time simulations. Starting from an information-entropy-optimized training set with over 7 million atomic environments, fitted potentials are benchmarked on a very large set of transition states to characterize their transferability. We also assess different practical strategies for enriching the training set so as to improve the accuracy of long-timescale simulations. The workflow is developed using the pyiron integrated development environment for computational materials science and executed with the Exaalt infrastructure.
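
To make the exponential sensitivity of transition rates to energy barriers concrete, the following is a minimal sketch, not part of the workflow itself. It assumes an Arrhenius (harmonic transition state theory) rate of the form k = ν exp(−E_a / k_B T) with a prefactor ν that is unaffected by the error, so that an error δE in the barrier changes the rate by a factor exp(|δE| / k_B T). The function name `rate_error_factor`, the temperature of 300 K, and the sample error values are illustrative assumptions. Note that δE here is an error in the total barrier height, not a per-atom error.

```python
import math

# Boltzmann constant in eV/K
K_B_EV_PER_K = 8.617333262e-5


def rate_error_factor(barrier_error_ev: float, temperature_k: float) -> float:
    """Multiplicative error in an Arrhenius rate caused by a barrier error.

    Assumes k = nu * exp(-E_a / (k_B * T)) with an error-free prefactor nu,
    so the ratio of the wrong rate to the true rate is exp(|dE| / (k_B * T)).
    """
    return math.exp(abs(barrier_error_ev) / (K_B_EV_PER_K * temperature_k))


if __name__ == "__main__":
    temperature_k = 300.0  # illustrative temperature (assumption)
    # Illustrative barrier errors in eV (assumptions, not results from the work)
    for error_ev in (0.001, 0.01, 0.05, 0.1):
        factor = rate_error_factor(error_ev, temperature_k)
        print(f"barrier error {error_ev:.3f} eV -> rate off by a factor of {factor:.2f}")
```

Under these assumptions, a barrier error of a few meV changes the rate by only a few percent at room temperature, while an error of 0.1 eV changes it by more than an order of magnitude, which is why accuracy at saddle configurations matters so much for long-time simulations.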