Jeff Xu1, Katherine Young2, Logan Keating1, Ajit Vikram3, Moonsub Shim1, Paul Kenis1
University of Illinois at Urbana-Champaign1, Purdue University2, Merck & Co Inc3
Autonomous workflows are increasingly being implemented across disparate facets of materials research to accelerate materials discovery and optimization. One application of autonomous workflows is the generation of accurate maps of synthesis parameter spaces. These maps can then be probed by subject matter experts to generate deeper insight and propose new hypotheses for further research.

Building an accurate parameter space map requires efficient sampling of the parameter space, which can be done effectively with active learning. Active learning algorithms combine a surrogate model, which is fit to existing data, with an acquisition function, which decides the next experiment(s) based on a pre-defined heuristic. Presently, the selection of surrogate models and acquisition functions relies on human intuition and requires an understanding of both statistical learning fundamentals and the nature of the material systems studied. Increased adoption of active learning, however, requires a set of beginner-friendly heuristics for selecting surrogate models and acquisition functions: scientists should be able to build effective active learning algorithms without prior machine learning experience.

This talk will report on a generalized, modular active learning approach that combines various surrogate models and acquisition functions to map the parameter space of quantum dot (QD) synthesis. We tested different surrogate models, acquisition functions, and batch sizes using simulated data and digital twins of real experimental data, and compared their relative performance in terms of accuracy and computational resources required. We then applied these active learning algorithms experimentally on an automated QD synthesis platform, across different synthesis chemistries, to test the generalizability of our approach.
This work yielded practical heuristics for building active learning algorithms, which should help scientists who wish to use machine learning to accelerate their automated online or manual offline workflows. The modular approach to implementing active learning also provides a useful framework for analyzing and assessing new active learning approaches.
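To make the modular idea concrete, the following is a minimal sketch of an active learning loop in which the surrogate model, the acquisition function, and the batch size are interchangeable parts. It is an illustration under stated assumptions, not the implementation used in this work: the names `GPSurrogate`, `max_uncertainty`, `active_learning`, and `toy_response` are hypothetical, the surrogate is a bare-bones Gaussian process with fixed hyperparameters, the batch is chosen naively as the top-scoring candidates, and the two-parameter response is a synthetic stand-in for a real synthesis experiment.

```python
import numpy as np


def rbf_kernel(a, b, length_scale):
    """Squared-exponential kernel between two sets of row vectors."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


class GPSurrogate:
    """Bare-bones Gaussian process regressor (hypothetical, fixed hyperparameters)."""

    def __init__(self, length_scale=0.2, noise=1e-4):
        self.length_scale = length_scale
        self.noise = noise

    def fit(self, X, y):
        self.X = X
        K = rbf_kernel(X, X, self.length_scale) + self.noise * np.eye(len(X))
        self.K_inv = np.linalg.inv(K)
        self.alpha = self.K_inv @ y
        return self

    def predict(self, X_new):
        K_s = rbf_kernel(X_new, self.X, self.length_scale)
        mean = K_s @ self.alpha
        # Prior variance is 1 for this kernel; subtract the explained part.
        var = 1.0 - np.einsum("ij,jk,ik->i", K_s, self.K_inv, K_s)
        return mean, np.sqrt(np.clip(var, 0.0, None))


def max_uncertainty(mean, std):
    """Acquisition score that favors candidates with the largest predicted std."""
    return std


def active_learning(run_experiment, candidates, surrogate, acquisition,
                    n_initial=5, n_rounds=10, batch_size=2, seed=0):
    """Alternate between fitting the surrogate and picking the next batch."""
    rng = np.random.default_rng(seed)
    sampled = [int(i) for i in rng.choice(len(candidates), n_initial, replace=False)]
    y = [run_experiment(candidates[i]) for i in sampled]
    for _ in range(n_rounds):
        surrogate.fit(candidates[sampled], np.array(y))
        mean, std = surrogate.predict(candidates)
        scores = np.array(acquisition(mean, std), dtype=float)  # copy before masking
        scores[sampled] = -np.inf  # do not repeat completed experiments
        for i in np.argsort(scores)[-batch_size:]:  # naive top-k batch
            sampled.append(int(i))
            y.append(run_experiment(candidates[i]))
    surrogate.fit(candidates[sampled], np.array(y))
    return surrogate, sampled


def toy_response(x):
    """Synthetic stand-in for a measured property over two normalized parameters."""
    return np.sin(3 * x[0]) * np.cos(2 * x[1])


# Hypothetical two-parameter synthesis space, normalized to [0, 1].
grid = np.linspace(0.0, 1.0, 15)
candidates = np.array([[a, b] for a in grid for b in grid])

model, sampled = active_learning(toy_response, candidates,
                                 GPSurrogate(), max_uncertainty)
predicted_map, predicted_std = model.predict(candidates)
```

Because `active_learning` only requires a surrogate exposing `fit` and `predict` and an acquisition function that maps predicted means and standard deviations to scores, other combinations can be compared by swapping these arguments while leaving the loop unchanged.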