Apr 8, 2025
1:45pm - 2:00pm
Summit, Level 4, Room 423
Sam Dong1, Ajinkya Hire1, Jason Gibson1, Richard Hennig1
University of Florida1
In the past decade, the increase in computational power has enabled crystal structure prediction (CSP) methodologies such as genetic algorithms (GAs) to identify novel, thermodynamically stable materials. More recently, the emergence of large materials databases has allowed generative machine learning (ML) models to propose novel, energetically favorable materials by sampling the latent space of a trained deep learning model. However, both approaches to CSP face challenges. The exploratory capability of generative ML is limited by the quality of its training data, which can lead to overfitting and the generation of redundant materials. GAs, in contrast, are not limited in exploration, as they can search multiple regions of a search space simultaneously. However, GAs for CSP are computationally intensive due to their reliance on first-principles calculations and can have difficulty converging on optimal solutions. Population initialization has been shown to be crucial to GA performance. Current initialization strategies in GAs for CSP use some form of random initialization with built-in physical heuristics. Even with these heuristics, random initialization can produce structures far from their relaxed state, increasing computational run times, and an initial population of poor structures can also hinder convergence. We introduce a generative initialization strategy in which a deep learning model trained on energetically favorable structures creates a stronger starting population by making a better initial guess at structures that lie near minima of the potential-energy landscape. Our data-driven approach improves on random initialization in two ways. First, we show that generative initialization creates individual starting structures that lie closer to local energy minima, reducing the computational cost of individual relaxation calculations.
Second, we show that generative initialization improves convergence, discovering thermodynamically stable materials at more than 1.5 times the rate of random initialization. The resulting impact of this data-driven approach will be more efficient and robust GAs for CSP, accelerating the discovery of novel, thermodynamically stable structures.
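The core idea of generative versus random initialization can be illustrated with a toy sketch. This is not the authors' actual model or workflow: the 1-D "energy surface", the population sizes, and all function names here are hypothetical stand-ins, with the generative model replaced by sampling near known low-energy regions, as a model trained on favorable structures would propose.

```python
import random

random.seed(0)

# Toy 1-D "potential-energy surface" with minima at x = +/-2, standing in
# for the DFT energy of a candidate crystal structure (hypothetical).
def energy(x):
    return (x * x - 4.0) ** 2

def random_init(n, lo=-10.0, hi=10.0):
    # Random initialization: uniform sampling over the whole search space.
    return [random.uniform(lo, hi) for _ in range(n)]

def generative_init(n, minima=(-2.0, 2.0), sigma=0.5):
    # Generative initialization (stand-in): propose candidates near
    # low-energy regions, mimicking a model trained on favorable structures.
    return [random.gauss(random.choice(minima), sigma) for _ in range(n)]

def mean_energy(pop):
    return sum(energy(x) for x in pop) / len(pop)

e_rand = mean_energy(random_init(50))
e_gen = mean_energy(generative_init(50))
print(f"random init mean energy:     {e_rand:.1f}")
print(f"generative init mean energy: {e_gen:.1f}")
```

In this sketch the generatively initialized population starts at a far lower mean energy, i.e., closer to local minima, which is the mechanism the abstract credits for cheaper relaxations and faster GA convergence.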