Dec 2, 2024
10:30am - 10:45am
Hynes, Level 2, Room 210
Dong Hyeon Mok1, Seoin Back1
Sogang University1
Recent advancements have shown that autoregressive language models can effectively generate inorganic crystal structures. Motivated by these advancements, we explored the potential of a language model both as a generative model for inorganic catalyst structures, which include surface and adsorbate atoms, and as a discovery tool for novel and promising electrocatalysts. We trained a language model based on the GPT architecture on 2M catalyst structures sourced from the Open Catalyst (OC) database released by Meta, enabling the generation of string representations of catalyst structures. Since the validity metrics used for crystal structure generative models are not fully applicable to catalysts, we developed new validity metrics specialized for catalyst structures. The trained model struggles to avoid generating invalid structures that contain overlapping atoms, although the validly generated structures are of high quality. We addressed this issue by introducing a simple method to bypass overlapping atoms, which effectively prevented structurally invalid generations without compromising generation quality. Furthermore, we fine-tuned the model on our own data for the discovery of two-electron oxygen reduction reaction (2e-ORR) catalysts. Despite the relatively small size of the dataset (about 1,500 data points), the fine-tuned model successfully learned the intrinsic rules of the dataset and maintained high-quality catalyst generation. From the fine-tuned model, we discovered five novel and promising 2e-ORR catalyst candidates. In conclusion, our work highlights not only the autoregressive language model's robust catalyst generative performance but also its practical applicability to electrocatalyst design.
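To make the string-representation idea concrete: the abstract does not specify the exact tokenization scheme, so the sketch below assumes a hypothetical line-based, CIF-like layout in which a composition line is followed by the lattice vectors and then one line per site, each carrying an element symbol, fractional coordinates, and a tag that distinguishes surface atoms from adsorbate atoms. The function name and format are illustrative assumptions, not the authors' published scheme.

```python
def structure_to_string(formula, lattice, sites):
    """Serialize a catalyst structure into a line-based string.

    `lattice` is a 3x3 list of cell vectors (Angstrom); `sites` is a list
    of (element, fractional_xyz, tag) tuples, where the tag marks an atom
    as e.g. "surface" or "adsorbate". This layout is a hypothetical
    CIF-like sketch for illustration only.
    """
    lines = [formula]
    for vec in lattice:
        lines.append(" ".join(f"{x:.2f}" for x in vec))
    for elem, frac, tag in sites:
        coords = " ".join(f"{x:.4f}" for x in frac)
        lines.append(f"{elem} {coords} {tag}")
    return "\n".join(lines)


example = structure_to_string(
    "Pt4O",
    [[3.92, 0.0, 0.0], [0.0, 3.92, 0.0], [0.0, 0.0, 3.92]],
    [("Pt", (0.0, 0.0, 0.0), "surface"),
     ("O", (0.5, 0.5, 0.7), "adsorbate")],
)
```

A GPT-style model would then be trained to emit such strings token by token, with the tag vocabulary letting it place adsorbates relative to the slab.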
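The overlapping-atom problem mentioned above amounts to a geometric validity test. A minimal sketch of such a check, assuming a pairwise-distance criterion against tabulated covalent radii (the radii values, tolerance, and function name are illustrative assumptions; a production check would also scan periodic images of neighboring cells):

```python
import itertools
import numpy as np

# Approximate covalent radii in Angstrom (illustrative values only).
COVALENT_RADII = {"Pt": 1.36, "O": 0.66, "H": 0.31, "Cu": 1.32}

def has_overlap(elements, cart_coords, tol=0.5):
    """Return True if any pair of atoms sits closer than `tol` times the
    sum of their covalent radii. Periodic images are ignored here for
    brevity."""
    coords = np.asarray(cart_coords, dtype=float)
    for i, j in itertools.combinations(range(len(elements)), 2):
        cutoff = tol * (COVALENT_RADII[elements[i]] + COVALENT_RADII[elements[j]])
        if np.linalg.norm(coords[i] - coords[j]) < cutoff:
            return True
    return False
```

Under a generate-and-filter or constrained-decoding scheme, a check like this can flag or bypass structures with overlapping atoms before they are counted as valid generations.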