Dec 6, 2024
11:15am - 11:30am
Hynes, Level 2, Room 210
Bowen Deng1, Yunyeong Choi1, Peichen Zhong1, Janosh Riebesell2, Shashwat Anand3, Zhuohan Li3, KyuJung Jun1, Kristin Persson1,3, Gerbrand Ceder1,3
1University of California, Berkeley; 2University of Cambridge; 3Lawrence Berkeley National Laboratory
Artificial intelligence is increasingly shifting the paradigm of materials discovery. One major contribution has come from machine learning interatomic potentials (MLIPs), which make it possible to scale atomic-level quantum chemical accuracy to large simulations. Recent advances have produced universal MLIPs (uMLIPs) pre-trained on diverse materials datasets, offering both ready-to-use universal force fields and robust foundations for downstream machine learning refinement. However, how well uMLIPs extrapolate to out-of-distribution (OOD) complex atomic environments remains unclear.

In this talk, we discuss the limitations and potential improvements of current foundational uMLIPs, including M3GNet, CHGNet, and MACE-MP-0, through a series of OOD benchmark tests covering surfaces, defects, phonons, ion migration barriers, and more. We uncover a systematic potential energy surface (PES) softening effect, characterized by underprediction of energies and forces across all benchmark tests and all current uMLIPs. We demonstrate that this PES softening can be effectively rectified by fine-tuning with a single additional data point. Our findings suggest that a considerable fraction of uMLIP errors are highly systematic and can therefore be efficiently corrected. These results provide a theoretical foundation for the widely observed data-efficient performance boosts achieved by fine-tuning uMLIPs and highlight the advantage of next-generation atomic modeling with large, comprehensive foundational AI models.
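The softening diagnosis described above amounts to a systematic rescaling of predicted forces relative to reference (DFT) values. As a minimal illustrative sketch, not the authors' actual benchmark code, one can quantify softening as the least-squares slope of predicted versus reference force components: a slope below 1 indicates systematic underprediction. The function name and the synthetic 20%-softened data below are hypothetical, for illustration only.

```python
import numpy as np

def softening_scale(f_pred, f_ref):
    """Least-squares slope of predicted vs. reference force components.

    A slope < 1 indicates a softened potential energy surface,
    i.e. systematic underprediction of force magnitudes.
    """
    f_pred = np.asarray(f_pred, dtype=float).ravel()
    f_ref = np.asarray(f_ref, dtype=float).ravel()
    # Slope of the zero-intercept fit f_pred ≈ k * f_ref
    return float(np.dot(f_pred, f_ref) / np.dot(f_ref, f_ref))

# Synthetic example: a model that uniformly underpredicts forces by 20%
rng = np.random.default_rng(0)
f_dft = rng.normal(size=(100, 3))   # reference forces (e.g. eV/Å)
f_mlip = 0.8 * f_dft                # softened predictions
print(round(softening_scale(f_mlip, f_dft), 3))  # 0.8
```

In this picture, fine-tuning on even one additional data point can correct the error because a single scale factor, rather than many independent parameters, dominates the discrepancy.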