MRS Meetings and Events

 

CH01.04.08 2022 MRS Spring Meeting

MaterialEyes—Acceleration of Materials Characterization Insights with Scientific Literature

When and Where

May 10, 2022
4:00pm - 4:15pm

Hawai'i Convention Center, Level 4, Kalakaua Ballroom A

Presenter

Co-Author(s)

Weixin Jiang (1,2), Eric Schwenker (2,1), Trevor Spreadbury (2), Oliver Cossairt (1), Maria Chan (2)

(1) Northwestern University, (2) Argonne National Laboratory

Abstract

Due to recent improvements in image resolution and acquisition speed, materials microscopy is experiencing an explosion of published imaging data. The standard publication format is sufficient for traditional data ingestion, where a small number of images can be critically examined and curated manually, but it is not conducive to large-scale data aggregation or analysis, which hinders data sharing and reuse. In the MaterialEyes project, we use computer vision and natural language processing tools to leverage materials characterization data in scientific literature. We develop the EXSCLAIM Python toolkit [1] for the automatic EXtraction, Separation, and Caption-based natural Language Annotation of IMages from scientific literature [2]. The EXSCLAIM pipeline allows us to organize the data from the literature in a more machine-processable format. We highlight the methodology [3] behind the construction of EXSCLAIM and demonstrate its ability to extract and label open-source scientific images at high volume. We then use the constructed microscopy imaging dataset to determine the contextual contents of experimental microscopy images, and the constructed spectroscopy plot dataset to develop machine learning models for studying materials properties. In particular, we propose a hybrid image retrieval system that measures both the visual similarity and the scale similarity between experimental measurements and figures crawled from the literature, so that the caption text of the retrieved figures can be used to interpret the query image.
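As a rough illustration of the hybrid retrieval idea, the sketch below ranks literature figures by a weighted sum of a visual term and a scale term. It is not the MaterialEyes implementation: the `Entry` fields, the cosine and scale-ratio measures, and the weight `alpha` are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Entry:
    """Hypothetical record for one image: feature vector, physical scale, caption."""
    features: Sequence[float]
    nm_per_pixel: float
    caption: str = ""


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def scale_similarity(s1: float, s2: float) -> float:
    """Ratio of the smaller to the larger scale: 1.0 when equal, near 0 when far apart."""
    if s1 <= 0 or s2 <= 0:
        raise ValueError("scales must be positive")
    return min(s1, s2) / max(s1, s2)


def hybrid_score(query: Entry, candidate: Entry, alpha: float = 0.5) -> float:
    """Weighted sum of visual and scale similarity; alpha is an assumed tuning weight."""
    visual = cosine_similarity(query.features, candidate.features)
    scale = scale_similarity(query.nm_per_pixel, candidate.nm_per_pixel)
    return alpha * visual + (1.0 - alpha) * scale


def retrieve(query: Entry, corpus: List[Entry], k: int = 3, alpha: float = 0.5) -> List[Entry]:
    """Return the k corpus entries with the highest hybrid score."""
    ranked = sorted(corpus, key=lambda e: hybrid_score(query, e, alpha), reverse=True)
    return ranked[:k]
```

With a score of this kind, a figure that looks identical to the query but was taken at a very different magnification can rank below a slightly less similar figure at a matching scale. The captions of the top-ranked entries would then supply the text used to interpret the query image.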
We also develop the Plot2Spectra tool [4] to extract spectral data from spectroscopy plot images, so that the extracted data points can be fed into subsequent machine learning models for large-scale data collection or analysis.

References

[1] https://github.com/MaterialEyes/exsclaim
[2] E Schwenker, W Jiang, T Spreadbury, N Ferrier, O Cossairt, MKY Chan, “EXSCLAIM!--An automated pipeline for the construction of labeled materials imaging datasets from literature,” arXiv preprint arXiv:2103.10631.
[3] W Jiang, E Schwenker, T Spreadbury, N Ferrier, MKY Chan, O Cossairt, “A Two-stage Framework for Compound Figure Separation,” 2021 IEEE International Conference on Image Processing (ICIP), DOI: 10.1109/ICIP42928.2021.9506171.
[4] W Jiang, E Schwenker, T Spreadbury, K Li, MKY Chan, O Cossairt, “Plot2Spectra: an Automatic Spectra Extraction Tool,” arXiv preprint arXiv:2107.02827.

Acknowledgements

This material is based upon work supported by Laboratory Directed Research and Development (LDRD) funding from Argonne National Laboratory, provided by the Director, Office of Science, of the U.S. Department of Energy under Contract No. DE-AC02-06CH11357. M.C. acknowledges the support from the BES SUFD Early Career award. Use of the Center for Nanoscale Materials, an Office of Science user facility, was supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357.

Symposium Organizers

Wenpei Gao, North Carolina State University
Arnaud Demortiere, Universite de Picardie Jules Verne
Madeline Dressel Dukes, Protochips, Inc.
Yuzi Liu, Argonne National Laboratory

Symposium Support

Silver
Protochips

Publishing Alliance

MRS publishes with Springer Nature