Raku Shirasawa¹, Ichiro Takemura¹, Shinnosuke Hattori¹, Yuuya Nagata²
¹Sony Group Corporation, ²Hokkaido University
A semi-automatic molecular exploration scheme to model the solvent solubility of tetraphenylporphyrin derivatives (TPPs) was designed and implemented. The scheme consists of the following steps: (1) defining a practical chemical search space, (2) prioritizing molecules in that space using an extended algorithm for submodular function maximization (SFMMOL), which requires neither biased variable selection nor pre-existing data, (3) synthesis and automatic measurement, and (4) building a machine-learning (ML) model. Even with a small number of evaluations (10 molecules: 0.13% of all targeted molecules), the evaluation order of TPPs selected using SFMMOL covered similar molecules amounting to 32% of all targeted molecules, whereas random sampling and uncertainty sampling covered ~7% and ~4%, respectively. The top 5 molecules in the order were synthesized and, together with commercially available molecules, a total of 15 molecules in 16 solvents were evaluated by automatic measurement of UV-Vis absorption spectra. The derived binary ML classifiers for solute-solvent pairs predicted ‘good solvents’ with an accuracy above 0.8. We thus confirmed the effectiveness of the proposed semi-automatic scheme for accelerating early-stage material search projects, and we expect the scheme to be applicable to a wider range of materials research.
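The prioritization idea in step (2) can be illustrated with a generic greedy maximization of a coverage function, which is a standard example of a submodular objective. This is only a minimal sketch and not the SFMMOL algorithm itself, whose details are not given here. The representation of molecules as sets of integer feature IDs, the Tanimoto similarity, the `threshold` value, and the names `tanimoto` and `greedy_coverage` are all assumptions made for illustration.

```python
from typing import List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def greedy_coverage(
    features: List[Set[int]], budget: int, threshold: float = 0.5
) -> List[int]:
    """Return an evaluation order chosen greedily to maximize coverage.

    A candidate "covers" every candidate (including itself) whose
    similarity to it is at least `threshold`. The number of covered
    candidates is a submodular set function, so greedy selection is a
    natural heuristic. Hypothetical illustration, not SFMMOL.
    """
    n = len(features)
    # neighbors[i]: indices of all candidates covered by candidate i
    neighbors = [
        {j for j in range(n) if tanimoto(features[i], features[j]) >= threshold}
        for i in range(n)
    ]
    covered: Set[int] = set()
    order: List[int] = []
    for _ in range(min(budget, n)):
        # Pick the unselected candidate with the largest marginal gain;
        # ties go to the lowest index.
        best = max(
            (i for i in range(n) if i not in order),
            key=lambda i: len(neighbors[i] - covered),
        )
        order.append(best)
        covered |= neighbors[best]
    return order


# Toy search space (made-up feature sets, not real TPP descriptors)
toy_features = [{1, 2, 3}, {1, 2, 4}, {1, 2, 3, 4}, {7, 8}, {7, 9}, {10}]
toy_order = greedy_coverage(toy_features, budget=2)
print(toy_order)
```

In this toy case the first pick covers the cluster of three mutually similar candidates, and the second pick moves to an uncovered region, which mirrors the intent of covering many similar molecules with few evaluations. No labels or prior measurements are needed, since the objective depends only on the structure of the search space.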