Dec 2, 2024
10:45am - 11:00am
Hynes, Level 2, Room 210
Wesley Reinhart¹, Antonia Statt²
¹The Pennsylvania State University, ²University of Illinois at Urbana-Champaign
Self-assembly of macromolecules is a critical phenomenon in both technology and biology, and sequence-defined molecules offer significant tunability. However, the number of possible sequences is too vast to evaluate exhaustively. In previous work, we explored the self-assembly of model copolymers using unsupervised representation learning and recurrent neural networks (RNNs). Although that approach solved the design problem through high-throughput screening, it requires a large corpus of training data to be effective.

Recent advances in large language models (LLMs) have shown promise in optimization tasks. Here, we demonstrate the ability of an LLM to perform evolutionary optimization for materials discovery. Anthropic’s Claude 3 Opus model significantly outperforms both an active learning scheme with handcrafted surrogate models and an evolutionary algorithm in selecting monomer sequences that produce targeted morphologies in macromolecular self-assembly. The model performs this task effectively with or without context about the task itself, but domain-specific context improves performance when there are no other hints about the solution. Furthermore, when this context is withheld, the model can infer an approximate notion of the task (e.g., calling it a protein folding problem). This work provides evidence of Claude 3’s ability to act as an evolutionary optimizer, a recently discovered emergent behavior of LLMs, and demonstrates a practical use case in the study and design of soft materials.
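
The evolutionary-optimization loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the two-letter monomer alphabet, the `toy_score` objective (a stand-in for a simulation-based morphology score), and the `random_mutation_proposer` (a stand-in for the step where an LLM would be shown the scored history and asked for new sequences) are all hypothetical names and assumptions introduced here for illustration.

```python
import random
from typing import Callable, List, Tuple

Scored = Tuple[str, float]  # (monomer sequence, score)


def toy_score(seq: str) -> float:
    """Hypothetical objective: fraction of sites matching an alternating
    'ABAB...' pattern. A real study would score a simulated morphology."""
    target = "AB" * (len(seq) // 2 + 1)
    return sum(a == b for a, b in zip(seq, target)) / len(seq)


def random_mutation_proposer(
    history: List[Scored], n_new: int, rng: random.Random
) -> List[str]:
    """Stand-in proposer (assumption): flip one monomer in each of the
    best sequences seen so far. An LLM-based proposer would instead be
    given the scored history as text and return new sequences."""
    ranked = sorted(history, key=lambda pair: pair[1], reverse=True)
    parents = [seq for seq, _ in ranked[:n_new]]
    children = []
    for i in range(n_new):
        parent = parents[i % len(parents)]
        pos = rng.randrange(len(parent))
        flipped = "B" if parent[pos] == "A" else "A"
        children.append(parent[:pos] + flipped + parent[pos + 1:])
    return children


def evolve(
    score: Callable[[str], float],
    propose: Callable[[List[Scored], int, random.Random], List[str]],
    length: int = 20,
    pop_size: int = 8,
    generations: int = 30,
    seed: int = 0,
) -> Scored:
    """Generic loop: score an initial random population, then repeatedly
    ask the proposer for candidates and add their scores to the history."""
    rng = random.Random(seed)
    history: List[Scored] = []
    for _ in range(pop_size):
        seq = "".join(rng.choice("AB") for _ in range(length))
        history.append((seq, score(seq)))
    for _ in range(generations):
        for seq in propose(history, pop_size, rng):
            history.append((seq, score(seq)))
    # Return the best (sequence, score) pair found over the whole run.
    return max(history, key=lambda pair: pair[1])


if __name__ == "__main__":
    best_seq, best_score = evolve(toy_score, random_mutation_proposer)
    print(best_seq, best_score)
```

Because the proposer is passed in as a plain callable, the same loop could compare a mutation-based baseline against any other proposal strategy under an identical evaluation budget.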