Jacqueline Cole1,2
University of Cambridge1, ISIS Pulsed Neutron and Muon Source2
This talk will introduce the ‘chemistry-aware’ natural language processing tool, ChemDataExtractor,[1,2] and illustrate its ability to auto-generate databases of battery and photovoltaic materials and device information. The battery database contains a total of ¼ million data records and includes chemical names of anodes, cathodes and electrolytes as well as five device properties.[3] The photovoltaics database contains ½ million data records and includes chemical names and device properties of perovskite-based and dye-sensitized solar cells.[4] The database auto-generation methods are discussed, as is the applicability of these large databases to machine-learning and algorithmic pipelines that enable a design-to-device approach to data-driven materials discovery.[5,6]<br/><b>References</b><br/>[1] Swain, Cole, <i>J. Chem. Inf. Model.</i> 2016, 56 (10), 1894–1904. www.chemdataextractor.org<br/>[2] Mavracic, Court, Isazawa, Elliott, Cole, <i>J. Chem. Inf. Model.</i> 2021, 61 (9), 4280–4289. www.chemdataextractor2.org<br/>[3] Huang, Cole, <i>Sci. Data</i> (Springer Nature) 2020, 7, 260. https://doi.org/10.1038/s41597-020-00602-2<br/>[4] Beard, Cole, <i>Sci. Data</i> (Springer Nature) 2021, submitted.<br/>[5] Cole, <i>Acc. Chem. Res.</i> 2020, 53, 599–610.<br/>[6] Cooper, Beard, Vazquez-Mayagoitia, Stan, Stenning, Nye, Vigil, Tomar, Jia, Bodedla, Chen, Gallego, Franco, Carella, Thomas, Xue, Zhu, Cole, <i>Advanced Energy Materials</i> 2018, 1802820. https://doi.org/10.1002/aenm.201802820