December 1 - 6, 2024
Boston, Massachusetts
2024 MRS Fall Meeting & Exhibit
MT01.01.07

Enhancing Machine Learning Interatomic Potentials with Multi-Fidelity Training

When and Where

Dec 2, 2024
4:15pm - 4:30pm
Hynes, Level 2, Room 205

Presenter(s)

Co-Author(s)

Jaesun Kim¹, Jisu Kim¹, Jaehoon Kim¹, Jiho Lee¹, Seungwu Han¹,²

¹Seoul National University, ²Korea Institute of Advanced Study

Abstract

Machine learning interatomic potentials (MLIPs) use machine learning to approximate the energies, forces, and stresses obtained from ab initio calculations. MLIPs are particularly valued for providing quantum mechanical precision at significantly lower computational cost. Because MLIPs are data-driven, their fidelity is dictated by the exchange-correlation functional used to build the ab initio training set. While semilocal functionals such as the generalized gradient approximation (GGA) are favored for their speed and reasonable accuracy for many physical properties, some applications require higher-fidelity functionals for predictive simulations. For example, in studies of ion conductivity in solid-state electrolytes, meta-GGA is preferred because conventional GGA functionals overestimate lattice parameters, resulting in exaggerated ion conductivities. However, constructing a high-fidelity dataset for an MLIP is challenging because of the high computational cost of such calculations.

A viable strategy to address this is to use abundant low-fidelity data, exploiting the strong correlation between data computed with different exchange-correlation functionals. In this presentation, we introduce an MLIP framework that trains on multi-fidelity databases simultaneously, which allows it to learn high-fidelity potential energy surfaces (PES) from only a small set of high-fidelity data. We implement the multi-fidelity framework in SevenNet [1], an equivariant graph neural network. Specifically, the framework employs shared parameters to capture the general trends of the PES that exist across datasets, alongside fidelity-specific weights that capture the finer variations in the PES between functionals.

We demonstrate multi-fidelity training with two examples: the InₓGa₁₋ₓN alloy and the argyrodite Li₆PS₅Cl. We generate a low-fidelity database from configurations of strained crystals and ab initio molecular dynamics (MD) simulations using the PBE functional. We then select a small portion of the structures in the low-fidelity database and perform one-shot calculations with the r²SCAN functional. We find that chemical environments not directly sampled in the high-fidelity database can be effectively inferred from the low-fidelity data. For example, in the InₓGa₁₋ₓN system, the interactions between In and Ga captured in the low-fidelity dataset can be transferred to enhance the high-fidelity MLIP, resulting in accurate predictions for the ternary alloy even when the high-fidelity database contains only binary crystals. In the Li₆PS₅Cl system, incorporating high-force configurations from the low-fidelity data significantly improves the MD stability and the lithium-ion conductivity predicted by the high-fidelity MLIP.

Furthermore, we develop a multi-fidelity universal MLIP using both the PBE and r²SCAN databases in the Materials Project. Our approach significantly enhances the accuracy of the MLIP for the high-fidelity channel, enabling accurate predictions of properties such as polymorph energy ordering, equilibrium volumes, and bulk moduli of crystal systems. However, the multi-fidelity MLIP tends to underestimate forces, leading to softer phonon spectra. We attribute this to the narrow range of forces around zero in the high-fidelity dataset. We believe that the present methodology holds promise for efficiently creating highly accurate, universal MLIPs by effectively expanding the high-fidelity dataset.

References
[1] Y. Park, J. Kim, S. Hwang, and S. Han, J. Chem. Theory Comput. 20, 4857–4868 (2024).
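
Illustrative sketch (not part of the abstract, and not SevenNet code): the idea of shared parameters combined with fidelity-specific weights, trained on both datasets at once, can be shown on a toy one-dimensional energy curve. Everything below is an assumption made for illustration: the two analytic "functionals" (`energy_low`, `energy_high`), the choice to share the quadratic coefficient while keeping the slope and offset fidelity-specific, and the plain gradient-descent loop. The actual framework uses an equivariant graph neural network, and which of its weights are fidelity-specific is not stated in the abstract.

```python
# Toy 1D stand-in for a potential energy surface; NOT SevenNet's architecture.
# Hypothetical "functionals": the two fidelities differ by a small linear term.
def energy_low(x):
    return x * x

def energy_high(x):
    return x * x + 0.1 * x + 0.05

# Abundant low-fidelity data on [-1, 1]; only two high-fidelity points.
low_data = [(k / 10.0, energy_low(k / 10.0)) for k in range(-10, 11)]
high_data = [(x, energy_high(x)) for x in (-0.5, 0.5)]
datasets = {"low": low_data, "high": high_data}

def predict(params, x, fidelity):
    """Shared curvature plus a fidelity-specific slope and offset."""
    slope, offset = params["specific"][fidelity]
    return params["shared"] * x * x + slope * x + offset

def train(datasets, steps=2000, lr=0.5):
    """Joint gradient descent on the sum of per-dataset mean squared errors."""
    params = {"shared": 0.0,
              "specific": {name: [0.0, 0.0] for name in datasets}}
    for _ in range(steps):
        grad_shared = 0.0
        grad_specific = {name: [0.0, 0.0] for name in datasets}
        for name, data in datasets.items():
            for x, y in data:
                # Average within each dataset so the small high-fidelity
                # set is not drowned out by the large low-fidelity set.
                err = (predict(params, x, name) - y) / len(data)
                grad_shared += err * x * x         # all fidelities update the shared weight
                grad_specific[name][0] += err * x  # only this fidelity's slope
                grad_specific[name][1] += err      # only this fidelity's offset
        params["shared"] -= lr * grad_shared
        for name in datasets:
            params["specific"][name][0] -= lr * grad_specific[name][0]
            params["specific"][name][1] -= lr * grad_specific[name][1]
    return params

params = train(datasets)
# x = 1.0 lies outside the range sampled by the high-fidelity data.
high_at_1 = predict(params, 1.0, "high")
print(f"high-fidelity prediction at x=1.0: {high_at_1:.4f}")
print(f"high-fidelity reference  at x=1.0: {energy_high(1.0):.4f}")
```

In this toy setting, two high-fidelity points cannot determine a quadratic on their own, yet the high-fidelity prediction at x = 1.0 lands close to the reference value of 1.15 because the curvature is learned from the low-fidelity data through the shared weight. This mirrors, in a very simplified form, the abstract's observation that environments missing from the high-fidelity set can be inferred from low-fidelity data.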

Symposium Organizers

Mikko Alava, NOMATEN Center of Excellence
Joern Davidsen, University of Calgary
Kamran Karimi, National Center for Nuclear Research
Enrique Martinez, Clemson University

Session Chairs

Jun Song
