December 1 - 6, 2024
Boston, Massachusetts
2024 MRS Fall Meeting & Exhibit
BI01.11.01

Advancing Open Science Through “DFT-ML” Tools for Materials Discovery

When and Where

Dec 5, 2024
10:00am - 10:30am
Sheraton, Second Floor, Constitution B

Presenter(s)

Co-Author(s)

Arun Kumar Mannodi-Kanakkithodi¹

¹Purdue University

Abstract

Typical materials discovery endeavors involve navigating a combinatorial atom-composition-structure space to efficiently optimize multiple desired properties at once. Today, leading materials researchers routinely use high-throughput computations and experiments within an autonomous and automated framework, combined with state-of-the-art data science or artificial intelligence approaches. In the Mannodi research group at Purdue University, we perform data-driven discovery of semiconductors for optoelectronic applications such as photovoltaics and photocatalysis, using high-throughput density functional theory (DFT) computations and machine learning (ML) algorithms [1,2,3]. “DFT-ML” predictive models, rigorously optimized on datasets of 10^3 to 10^4 points, enable prediction and screening over more than 10^6 possible materials, orders of magnitude faster than a full computational or experimental approach. Such models are trained in a multi-fidelity manner [2], incorporating many levels of theory and even experimental data, and within an active learning framework in which new computations are systematically performed to reduce prediction uncertainties and identify the most promising compounds in terms of stability, defect tolerance, and optoelectronic properties.

Given the importance of training the next generation of researchers in the skills required for data-driven materials discovery, these projects have been converted into multiple user-friendly tools on GitHub and nanoHUB, an online science gateway housed at Purdue [4]. These tools are powered by Jupyter notebooks that store all the DFT data and enable its easy visualization, contain all the code necessary for training ML models and examining their predictions, and enable easy predictions on new data points.
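
As a rough illustration of the uncertainty-driven active learning idea mentioned above, the following minimal sketch selects, in each round, the candidate whose prediction is least certain and "computes" it next. Everything here is an assumption made for illustration and not the group's actual workflow: `mock_dft` is a toy stand-in for a DFT calculation, `features` is a hypothetical descriptor map, and a bootstrap ensemble of ridge regressions is swapped in as a simple way to estimate prediction uncertainty.

```python
import numpy as np


def mock_dft(x):
    """Toy stand-in for an expensive DFT calculation (hypothetical)."""
    return np.sin(3.0 * x[:, 0]) + 0.5 * x[:, 1] ** 2


def features(x):
    """Hypothetical polynomial descriptors; real models would use materials descriptors."""
    return np.column_stack([np.ones(len(x)), x, x ** 2, x[:, :1] * x[:, 1:2]])


def fit_ridge(phi, y, alpha=1e-3):
    """Closed-form ridge regression weights."""
    a = phi.T @ phi + alpha * np.eye(phi.shape[1])
    return np.linalg.solve(a, phi.T @ y)


def ensemble_predict(x_train, y_train, x_pool, rng, n_models=20):
    """Bootstrap ensemble: mean prediction and spread (used as the uncertainty)."""
    phi_train, phi_pool = features(x_train), features(x_pool)
    preds = []
    for _ in range(n_models):
        # Resample the training set with replacement and refit.
        idx = rng.integers(0, len(x_train), size=len(x_train))
        preds.append(phi_pool @ fit_ridge(phi_train[idx], y_train[idx]))
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)


def active_learning(n_initial=8, n_rounds=10, pool_size=500, seed=0):
    """Repeatedly compute the pool candidate with the largest ensemble spread."""
    rng = np.random.default_rng(seed)
    x_pool = rng.uniform(-1.0, 1.0, size=(pool_size, 2))  # candidate "materials"
    labeled = [int(i) for i in rng.choice(pool_size, size=n_initial, replace=False)]
    y_labeled = list(mock_dft(x_pool[labeled]))
    for _ in range(n_rounds):
        _, std = ensemble_predict(x_pool[labeled], np.array(y_labeled), x_pool, rng)
        std[labeled] = -np.inf  # never re-select an already computed candidate
        pick = int(np.argmax(std))
        labeled.append(pick)
        y_labeled.append(float(mock_dft(x_pool[pick:pick + 1])[0]))
    return x_pool, labeled, np.array(y_labeled)
```

Calling `active_learning()` returns the candidate pool, the indices of the candidates that were computed, and their values. A practical workflow would likely combine the uncertainty with the predicted property itself when ranking candidates, so that promising compounds are favored over merely uncertain ones.
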
Our goal is to ensure that all our data and models are Findable, Accessible, Interoperable, and Reusable (FAIR) [5], which is critical for advancing research and facilitating collaboration within the scientific community. Specifically, we develop a comprehensive workflow that uses nanoHUB’s Sim2Ls framework [6] to systematically parse DFT calculations and store them in a universally indexed database. This database is designed to be easily queried via a Python-based API, which simplifies data access and manipulation for researchers. Moreover, by integrating ML predictive models both as scripts and as graphical user interfaces (GUIs), our workflow enables rapid and accurate predictions of key material properties.

In this presentation, I will discuss how we use the above tools for (a) discovering novel halide perovskites for optoelectronic applications and accelerating the prediction of defect properties in technologically important semiconductors, (b) sharing data and models with the community, welcoming engagement, reducing duplication of effort, and driving future collaborations, and (c) supporting education, specifically through hands-on tutorials organized on behalf of MRS at spring and fall meetings and online via nanoHUB, and as exercise material in graduate courses on materials modeling and informatics. Our workflows are dynamic, with new data and capabilities added regularly, and are currently being expanded to multiple materials classes and energy-relevant applications.

References
[1] J. Yang et al., Digital Discovery 2, 856-870 (2023).
[2] J. Yang et al., J. Chem. Phys. 160, 064114 (2024).
[3] M. H. Rahman et al., APL Machine Learning 2, 016122 (2024).
[4] K. Madhavan et al., Nanotechnology Reviews 2(1), 107-117 (2013).
[5] L. C. Brinson et al., MRS Bulletin 49, 12-16 (2024).
[6] M. Hunt et al., PLOS ONE 17(3) (2022).

Symposium Organizers

Deepak Kamal, Solvay Inc
Christopher Kuenneth, University of Bayreuth
Antonia Statt, University of Illinois
Milica Todorović, University of Turku

Session Chairs

Maria Chan
Christopher Kuenneth
