Apr 24, 2024
11:30am - 11:45am
Room 322, Level 3, Summit
Victor Fung1, Shuyi Jia1, Fan Shu1, Akaash Parthasarathy1, Chandreyi Chakraborty1
Georgia Institute of Technology1
Pre-training of machine learning models, particularly in the form of self-supervised learning, is now a ubiquitous approach for improving model performance and robustness, and has featured prominently in natural language processing and computer vision. Applying similar concepts to the materials sciences requires new domain-aware pre-training strategies. We introduce a series of physics-informed pre-training strategies that align well with materials data and can be applied to the widely used class of graph neural network models. We demonstrate the effectiveness of this approach on a wide range of benchmarks spanning multiple materials systems and properties. We discuss the benefits of pre-training in situations where datasets are limited in size and robust out-of-distribution performance is needed, as well as the potential implications of pre-training for developing foundational models for the materials sciences.
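To illustrate the kind of self-supervised, physics-motivated pretext task the abstract refers to, the sketch below pre-trains a small graph neural network to recover Gaussian noise added to atomic coordinates (a coordinate-denoising objective). This is an assumption-based example, not the authors' implementation: the SimpleGNN architecture, the noise scale sigma, and the featurization are illustrative choices only.

```python
# Illustrative sketch of a coordinate-denoising pre-training objective for a materials GNN.
# NOT the authors' method; model, features, and noise scale are assumptions for illustration.
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv
from torch_geometric.data import Data


class SimpleGNN(nn.Module):
    """Tiny GNN mapping per-atom features to a per-atom 3D vector (the predicted noise).

    Real materials models typically use geometry-aware message passing; a plain GCN
    is used here only to keep the sketch short.
    """

    def __init__(self, num_node_features: int, hidden_dim: int = 64):
        super().__init__()
        self.conv1 = GCNConv(num_node_features, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)
        self.head = nn.Linear(hidden_dim, 3)  # predict the 3D displacement of each atom

    def forward(self, x, edge_index):
        h = torch.relu(self.conv1(x, edge_index))
        h = torch.relu(self.conv2(h, edge_index))
        return self.head(h)


def denoising_pretrain_step(model, data: Data, optimizer, sigma: float = 0.1):
    """One self-supervised step: perturb atomic positions, train the model to recover the noise."""
    noise = sigma * torch.randn_like(data.pos)              # Gaussian perturbation of coordinates
    noisy_input = torch.cat([data.x, data.pos + noise], -1) # atom features + noisy positions
    pred_noise = model(noisy_input, data.edge_index)
    loss = nn.functional.mse_loss(pred_noise, noise)        # denoising (noise-regression) loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Hypothetical usage: the input dimension accounts for the concatenated 3D coordinates.
# model = SimpleGNN(num_node_features=data.x.size(-1) + 3)
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# loss = denoising_pretrain_step(model, data, optimizer)
```

After pre-training on such a pretext task, the GNN backbone would typically be fine-tuned on a downstream property-prediction dataset, with the denoising head discarded.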