December 1–6, 2024
Boston, Massachusetts
2024 MRS Fall Meeting & Exhibit
MT04.10.04

MatFold—Cross-validation Protocols to Systematically Evaluate Generalization Errors in Materials Discovery Models

When and Where

Dec 5, 2024
8:45am - 9:00am
Hynes, Level 2, Room 210

Presenter(s)

Co-Author(s)

Peter Schindler¹, Matthew Witman²

¹Northeastern University, ²Sandia National Laboratories

Abstract

Machine learning models in materials science validated by a single train/validation/test split, or by non-nested K-fold splits using a single splitting criterion, can yield biased and overly optimistic performance estimates for downstream modeling or materials screening tasks. This can be particularly counterproductive for applications where the time and cost of failed validation efforts (experimental synthesis, characterization, and testing) are consequential. We propose a set of standardized and increasingly difficult splitting protocols for chemically and structurally motivated, nested K-fold cross-validation that can be followed to validate any machine learning model for materials discovery. Among several benefits, this enables systematic insights into model generalizability, improvability, and uncertainty, provides benchmarks for fair comparison between competing models with access to differing quantities of data, and systematically reduces possible data leakage through increasingly strict splitting protocols. A general-purpose toolkit, MatFold, is provided to automate the construction of these chemically motivated train/test splits and to facilitate further community use. We employ MatFold to analyze the generalization errors of models trained on two datasets with distinct architectures. One dataset contains relaxed vacancy formation energies, predicted using a graph-convolutional neural network [1]. The other consists of surface work functions, predicted using an elemental random forest model [2]. The observed trends in generalization errors and their variances across the MatFold splitting protocols reveal unique scaling behavior for each model architecture.

References:
[1] M. D. Witman, A. Goyal, T. Ogitsu, A. H. McDaniel, S. Lany, "Defect graph neural networks for materials discovery in high-temperature clean-energy applications," Nature Computational Science 2023, 3, 675–686.
[2] P. Schindler, E. R. Antoniuk, G. Cheon, Y. Zhu, E. J. Reed, "Discovery of Stable Surfaces with Extreme Work Functions by High-Throughput Density Functional Theory and Machine Learning," Advanced Functional Materials 2024, 34 (19), 2401764.

Acknowledgements:
Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy's National Nuclear Security Administration (DOE/NNSA) under contract DE-NA0003525. This written work is authored by an employee of NTESS. The employee, not NTESS, owns the right, title, and interest in and to the written work and is responsible for its contents. Any subjective views or opinions that might be expressed in the written work do not necessarily represent the views of the U.S. Government. The publisher acknowledges that the U.S. Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this written work or allow others to do so, for U.S. Government purposes. The DOE will provide public access to results of federally sponsored research in accordance with the DOE Public Access Plan.
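To illustrate the kind of chemically motivated, nested cross-validation described in the abstract, the sketch below uses scikit-learn's GroupKFold so that entire chemical systems are held out of each test fold. This is not the MatFold API; the grouping rule (element set of a composition string), the helper name chemical_system, and the toy data are assumptions made for the example.

```python
# Conceptual sketch of chemically motivated, nested K-fold cross-validation.
# Not the MatFold API: the grouping rule and toy data are illustrative only.
import re

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, GroupKFold

def chemical_system(composition: str) -> str:
    """Map a composition string such as 'Fe2O3' to its element set, 'Fe-O'."""
    elements = sorted(set(re.findall(r"[A-Z][a-z]?", composition)))
    return "-".join(elements)

# Toy dataset: composition strings, random feature vectors, and random targets.
compositions = ["Fe2O3", "FeO", "TiO2", "Ti2O3", "MgO", "Mg2SiO4", "SiO2", "NaCl"]
rng = np.random.default_rng(0)
X = rng.normal(size=(len(compositions), 5))
y = rng.normal(size=len(compositions))

# Group labels: entries from the same chemical system share a label, so a system
# seen during training never appears in the corresponding test fold.
groups = np.array([chemical_system(c) for c in compositions])

outer = GroupKFold(n_splits=3)
outer_scores = []
for train_idx, test_idx in outer.split(X, y, groups):
    # Inner loop: hyperparameter selection, again split by chemical system to
    # avoid leakage between inner training and validation folds.
    inner = GroupKFold(n_splits=2)
    search = GridSearchCV(
        RandomForestRegressor(random_state=0),
        param_grid={"n_estimators": [50, 100]},
        cv=inner.split(X[train_idx], y[train_idx], groups[train_idx]),
        scoring="neg_mean_absolute_error",
    )
    search.fit(X[train_idx], y[train_idx])
    outer_scores.append(search.score(X[test_idx], y[test_idx]))

print("Held-out MAE per outer fold:", [-s for s in outer_scores])
print("Mean generalization estimate (MAE):", -np.mean(outer_scores))
```

The increasingly strict protocols mentioned in the abstract follow the same pattern with coarser group labels, for example holding out all entries containing a given element or structure prototype rather than a single chemical system.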

Symposium Organizers

Kjell Jorner, ETH Zurich
Jian Lin, University of Missouri-Columbia
Daniel Tabor, Texas A&M University
Dmitry Zubarev, IBM

Session Chairs

Kjell Jorner
Dmitry Zubarev
