DS01.12.04 2022 MRS Spring Meeting

Cost-Efficient Training of a Neural Network Potential by Means of Active Learning for Fast and Accurate Molecular Dynamics Simulations

When and Where

May 12, 2022
2:30pm - 2:45pm

Hawai'i Convention Center, Level 3, Lili'u Theater, 310

Presenter and Co-Author(s)

Sung-Ho Lee1,2, Valerio Olevano3,2, Benoit Sklénard1,2

1CEA-Leti, 2Université Grenoble Alpes, 3CNRS, Institut Néel

Abstract

In recent years, machine learning (ML) has found a growing number of applications and has become an important area of research in many domains. In computational physics, atomistic simulations using ML interatomic potentials have been shown to overcome major barriers of traditional methods [1]. Harnessing the power of neural network potentials (NNPs), calculations can be performed many orders of magnitude faster than density functional theory (DFT), with near-ab initio accuracy and linear scaling with respect to system size. Simulations that were previously prohibitively expensive for ab initio methods thus become practicable.

Obtaining an accurate and generalizable NNP depends heavily on a sufficiently diverse training dataset. It is also crucial to detect when the network is uncertain about a prediction, because in molecular dynamics (MD) errors accumulate across time steps and gradually drive the simulation away from reality. To this end, we employed active learning based on Query-by-Committee (QbC) [2].

In QbC, an ensemble of neural networks predicts the same data point, and the disagreement among the members serves as a proxy for the uncertainty of the prediction (as sketched below). High disagreement indicates that the point is out-of-distribution: in the context of NNPs, the ionic step contains local environments configurationally distinct from those in the training dataset, so the prediction is unreliable. Such points are added to the training dataset to improve the generalizability and quality of fit of the network, a process called active learning. By adding only the structures the network is least confident about, rather than naively supplying a bulk of structures spanning a large configuration space, the size of the dataset, and hence the cost of training, is kept to a minimum. Evaluating the uncertainty at every time step also provides a running measure of the reliability of the simulation.

Here, we present a neural network potential for SiGe, a material of interest in many technological applications. The potential was trained with our in-house NNP package, based on Behler's high-dimensional NNP model [1], which can train a potential and run molecular dynamics simulations through an interface to LAMMPS [3]. Training was carried out with an extended Kalman filter optimizer [4], which was shown to perform exceptionally well for NNP applications in terms of both final loss and training time.

References
1. J. Behler, Chemical Reviews 121, 10037 (2021).
2. R. Burbidge, J. J. Rowland, and R. D. King, Intelligent Data Engineering and Automated Learning (2007).
3. S. Plimpton, Journal of Computational Physics 117, 1-19 (1995).
4. S. Singhal and L. Wu, Advances in Neural Information Processing Systems 1 (1988).
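To make the QbC criterion above concrete, the following is a minimal Python sketch of a committee disagreement measure: the spread of per-atom force predictions across the ensemble, reduced to a single score per structure. The array layout and the max-over-atoms reduction are illustrative assumptions, not necessarily the exact metric used in this work.

import numpy as np

def committee_disagreement(forces):
    """Disagreement of a committee of NNPs on one structure.

    forces: array of shape (n_models, n_atoms, 3) holding each
    committee member's force prediction (assumed layout).
    """
    mean_f = forces.mean(axis=0)                    # committee-mean forces, (n_atoms, 3)
    dev = np.linalg.norm(forces - mean_f, axis=2)   # per-model, per-atom deviation
    per_atom = np.sqrt((dev ** 2).mean(axis=0))     # RMS disagreement per atom
    # Take the worst atom: one poorly described local environment is
    # enough to make the whole ionic step unreliable.
    return per_atom.max()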
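Building on that score, one pass of the active-learning cycle flags high-disagreement MD frames, relabels them at the DFT level, and retrains the committee. This is a sketch under assumed interfaces: predict_forces, train, and run_dft are hypothetical names, and it reuses committee_disagreement from the sketch above.

import numpy as np

def active_learning_pass(committee, md_frames, threshold, dataset):
    """One QbC active-learning pass over a set of MD frames."""
    flagged = []
    for frame in md_frames:
        # Stack each member's force prediction for this frame.
        forces = np.stack([m.predict_forces(frame) for m in committee])
        if committee_disagreement(forces) > threshold:
            flagged.append(frame)
    # Relabel the flagged frames with the reference method (DFT,
    # via the hypothetical run_dft helper) and extend the training
    # set with those frames only.
    dataset.extend(run_dft(f) for f in flagged)
    for m in committee:
        m.train(dataset)
    return len(flagged)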
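The extended Kalman filter optimizer of reference [4] treats the network weights as the state of a static system and each training sample as a noisy observation of the network output. The following is a minimal single-output global-EKF weight update, assuming model(w, x) returns the prediction, jacobian(w, x) its gradient with respect to the weights, and R an observation-noise variance chosen by the user; process noise is omitted for brevity, and the actual optimizer used in this work may differ in these details.

import numpy as np

def ekf_update(w, P, x, y, model, jacobian, R=1e-2):
    """One extended-Kalman-filter update of the weight vector.

    w: (n_w,) weights; P: (n_w, n_w) weight covariance;
    x, y: one training input and its scalar target.
    """
    H = jacobian(w, x)               # linearization of the output, (n_w,)
    S = H @ P @ H + R                # innovation variance (scalar)
    K = (P @ H) / S                  # Kalman gain, (n_w,)
    w = w + K * (y - model(w, x))    # correct weights by the weighted residual
    P = P - np.outer(K, H @ P)       # shrink the weight covariance
    return w, P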

Symposium Organizers

Mathieu Bauchy, University of California, Los Angeles
Mathew Cherukara, Argonne National Laboratory
Grace Gu, University of California, Berkeley
Badri Narayanan, University of Louisville
