Dec 5, 2024
4:30pm - 4:45pm
Hynes, Level 2, Room 209
Armi Tiihonen¹, Louis Filstroff², Petrus Mikkola¹, Emma Lehto¹, Samuel Kaski¹,³, Milica Todorović⁴, Patrick Rinke¹
Aalto University¹, Université Lille², The University of Manchester³, University of Turku⁴
Bayesian optimization (BO) is a machine learning method that can guide autonomous materials optimization experiments, such as optimizing perovskite compositions for more stable solar cells. BO typically optimizes the target property in a sample-efficient manner, which makes it a popular choice for costly experiments. However, coupling BO with fully or semi-automated sample preparation introduces the challenge of ensuring sufficient sample quality during the optimization loop: low-quality samples could obscure the optimization process, yet they can be difficult and sometimes expensive to detect automatically. Such sample quality concerns currently hinder automated materials optimization. This holds especially in largely unexplored materials domains, where unexpected types of low-quality samples may appear, and in high-dimensional optimizations, which require a high level of automation to reach the necessary sample sizes.<br/><br/>To make BO-guided materials optimization more robust, we add humans into the BO loop (human-in-the-loop, HITL) as an additional data source that comments on sample quality. Humans can estimate sample quality flexibly, either visually or by prescribing additional characterizations. To minimize the burden on them, humans are queried only when necessary. The code implementation also allows human feedback to be given asynchronously with the experiment loop, so that humans do not become a bottleneck of the optimization.<br/><br/>We implemented three HITL schemes: two based on data fusion via an added cost to the BO acquisition function, and one based on multifidelity BO using the Bayesian Optimization Structure Search (BOSS) package [1].
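The idea of data fusion via an added cost to the acquisition function can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the BOSS API. The kernel-weighted quality estimate, the subtractive form of the cost, the query thresholds, and all function and parameter names are assumptions made for illustration only.

```python
import numpy as np

def rbf(a, b, length_scale=0.15):
    """Squared-exponential similarity between two sets of points (one point per row)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def low_quality_probability(candidates, labeled_x, labeled_bad,
                            prior=0.1, prior_weight=1.0):
    """Kernel-weighted estimate of P(low quality) from human labels (1 = low quality).

    Hypothetical stand-in for a quality model; far from any label it falls
    back to `prior`.
    """
    if len(labeled_x) == 0:
        return np.full(len(candidates), prior)
    w = rbf(candidates, labeled_x)
    return (w @ labeled_bad + prior * prior_weight) / (w.sum(axis=1) + prior_weight)

def penalized_acquisition(acq, p_bad, cost_weight=1.0):
    """Subtract a sample-quality cost from the base acquisition values (assumed form)."""
    return acq - cost_weight * p_bad

def needs_human(p_bad, lower=0.3, upper=0.7):
    """Assumed query rule: ask a human only when the quality estimate is ambiguous."""
    return lower < p_bad < upper

# Toy 1-D example: the base acquisition peaks at x = 0.8, but humans have
# flagged samples around x = 0.8 as low quality.
candidates = np.linspace(0.0, 1.0, 101)[:, None]
acq = np.exp(-((candidates[:, 0] - 0.8) ** 2) / 0.02)  # hypothetical acquisition
labeled_x = np.array([[0.75], [0.80], [0.85], [0.20]])
labeled_bad = np.array([1.0, 1.0, 1.0, 0.0])

p_bad = low_quality_probability(candidates, labeled_x, labeled_bad)
scores = penalized_acquisition(acq, p_bad, cost_weight=2.0)
next_x = float(candidates[np.argmax(scores), 0])
```

In this toy setting the penalized score steers the next suggestion away from the flagged region around x = 0.8, even though the unpenalized acquisition peaks there. How strongly the search is redirected depends on the assumed `cost_weight`.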
We tested these approaches in realistic simulations built on previously obtained perovskite experiments [2], where the low-quality samples consisted predominantly of samples crystallized into non-photoactive phases of the material.<br/><br/>We show with simulated BO benchmarks that HITL is a straightforward yet effective way to gain information on regions with low-quality samples and to prevent the BO search from converging into those regions, with a reasonable added effort from humans: our data fusion HITL BO method asked humans to assess 7% of the samples on average across 25 repeated BO runs. As a result, on average only 2% of the samples were drawn from the low-quality region, compared with 25% for the reference method without humans. Thus, HITL makes the BO approach more robust to sample quality variations and provides an opportunity to pursue more complex materials domains with semi-autonomous setups.<br/><br/>References:<br/><br/>[1] Todorović, Milica, et al. "Bayesian inference of atomistic structure in functional materials." npj Computational Materials 5.1 (2019): 35.<br/>[2] Sun, Shijing, et al. "A data fusion approach to optimize compositional stability of halide perovskites." Matter 4.4 (2021): 1305-1322.