Martin Seifrid1
North Carolina State University1
Due to the complexity of designing organic semiconducting materials (OSCs) and devices, researchers have long used computational tools to screen candidate molecular structures and gain further insight into physical processes. Machine learning (ML) is emerging as a powerful tool for accelerating this work even further. Although processing conditions are known to play a significant role in determining the power conversion efficiency (PCE) of organic photovoltaics (OPVs), ML has so far only been used to predict PCE from molecular structure alone.

In this talk, I will discuss how we can integrate device fabrication data into ML models, as well as best practices for applying ML to OPVs. One of the key challenges in gathering such datasets is the difficulty of extracting data from the literature. We have created the first dataset containing both molecular structure and device processing conditions. We find sobering evidence of the low quality of such data, which is a particular concern if we seek to use ML to accelerate the development of OPVs and OSCs. Many are aware of the widespread problem of low-quality literature data, which affects numerous fields and is not discussed openly enough. The question for the future of ML and meta-analyses based on literature-derived datasets becomes: what do we do going forward?

I will initiate a discussion with the community about what we can do to make generating large OPV and OSC datasets easier and more reliable, in both the short and long term. In particular, I will outline a set of standards that could help make reporting and collecting data more robust.