Apr 8, 2025
1:45pm - 2:00pm
Summit, Level 4, Room 423
Sam Dong1, Ajinkya Hire1, Jason Gibson1, Richard Hennig1
University of Florida1
In the past decade, the increase in computational power has enabled crystal structure prediction (CSP) methodologies such as genetic algorithms (GAs) to identify novel, thermodynamically stable materials. More recently, the emergence of large materials databases has allowed generative machine learning (ML) models to propose novel, energetically favorable materials by sampling the latent space of a trained deep learning model. However, both approaches to CSP face challenges. The exploratory capability of generative ML is limited by the quality of its training data, which can lead to overfitting and the generation of redundant materials. GAs, in contrast, are not limited in exploration, as they can search multiple regions of a search space simultaneously. However, GAs for CSP are computationally intensive due to their reliance on first-principles calculations and can have difficulty converging on optimal solutions. Population initialization has been shown to be crucial to GA performance. Current initialization strategies in GAs for CSP use some form of random initialization with built-in physical heuristics. Even with these heuristics, random initialization can produce structures far from their relaxed state, increasing computational run times, and an initial population of poor structures can also hinder convergence. We introduce a generative initialization strategy in which a deep learning model trained on energetically favorable structures creates a stronger starting population by making a better initial guess at structures that lie near minima of the potential-energy landscape. Our data-driven approach improves on random initialization in two ways. First, we show that generative initialization creates individual starting structures that lie closer to local energy minima, reducing the computational cost of individual relaxation calculations.
Second, we show that generative initialization improves convergence, discovering thermodynamically stable materials at more than 1.5 times the rate of random initialization. The resulting impact of this data-driven approach will be more efficient and robust GAs for CSP, accelerating the discovery of novel, thermodynamically stable structures.
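The core idea of generative versus random initialization can be illustrated with a toy sketch. This is not the authors' actual model or workflow: the 1-D "energy surface", the population sizes, and all function names here are hypothetical stand-ins, with the generative model replaced by sampling near known low-energy regions, as a model trained on favorable structures would propose.

```python
import random

random.seed(0)

# Toy 1-D "potential-energy surface" with minima at x = +/-2, standing in
# for the DFT energy of a candidate crystal structure (hypothetical).
def energy(x):
    return (x * x - 4.0) ** 2

def random_init(n, lo=-10.0, hi=10.0):
    # Random initialization: uniform sampling over the whole search space.
    return [random.uniform(lo, hi) for _ in range(n)]

def generative_init(n, minima=(-2.0, 2.0), sigma=0.5):
    # Generative initialization (stand-in): propose candidates near
    # low-energy regions, mimicking a model trained on favorable structures.
    return [random.gauss(random.choice(minima), sigma) for _ in range(n)]

def mean_energy(pop):
    return sum(energy(x) for x in pop) / len(pop)

e_rand = mean_energy(random_init(50))
e_gen = mean_energy(generative_init(50))
print(f"random init mean energy:     {e_rand:.1f}")
print(f"generative init mean energy: {e_gen:.1f}")
```

In this sketch the generatively initialized population starts at a far lower mean energy, i.e., closer to local minima, which is the mechanism the abstract credits for cheaper relaxations and faster GA convergence.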