Extracting Materials Information from Scientific Articles with Vision Transformer Models

When and Where

Dec 3, 2024
4:45pm - 5:00pm

Hynes, Level 2, Room 210

Presenter(s)

Defne Çirci

Miles Bradley

Bhuwan Dhingra

Catherine Brinson

Co-Author(s)

Defne Çirci¹, Miles Bradley¹, Bhuwan Dhingra¹, Catherine Brinson¹

Duke University¹

Abstract

The rapidly growing predictive power of modern machine-learning algorithms demands parallel advances in the size and quality of the domain-specific datasets used for training. Unfortunately, many academic domains, such as materials science, lack such datasets because real-world information is largely unstructured. Despite the wealth of domain knowledge generated in modern materials science research, much of it remains underutilized because the underlying experimental data are often buried in tables and figures. Vision transformer models present a new opportunity to rapidly and accurately extract data and insights from published literature and transform them into structured data formats. However, current end-to-end image-to-text transformer models for chart-to-table translation fail to capture the diverse and complex nature of materials science figures, which makes extracting the underlying data tables a critical yet challenging step. Typical failures include inconsistently extracted axis labels, irregularly presented row and column data, and line labels from a chart's legend that are not extracted at all. We propose an approach that decomposes the task of extracting composition and property information from charts into two steps: (1) converting charts to structured formats such as tables, and (2) inputting these tables into language models to obtain sample information with their associated properties and compositions. To address the challenges in step one, we aim to fine-tune a pretrained image-to-text model on materials science figures with complete and consistent annotations. Additionally, we introduce evaluation techniques tailored to the specifics of materials science. Focusing on the subdomain of polymer composites, we demonstrate both the successes and challenges of using multimodal models to extract tabular and chart data.
Successful chart-to-table conversion and extraction of experimental sample information can further enable downstream tasks such as question answering, summarization, and property prediction, enhancing the utilization of materials science research data.
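
To make the hand-off between the two steps concrete, the following is a minimal sketch, not the authors' implementation. It assumes that step one emits a pipe-delimited text table with an optional "y_label:" line, a header row of the form "x label | series 1 | series 2", and one row per x value. That layout, the names Measurement and parse_chart_table, and the numbers in the example are all hypothetical and serve only as illustration. The sketch rejects rows whose width differs from the header, which is one simple way to catch the irregular row and column output described above before the table is passed to a language model.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    series: str   # legend label, e.g. a sample name
    x_label: str  # x-axis label, including units if present
    x: float
    y_label: str  # y-axis label, empty if the table did not provide one
    y: float


def parse_chart_table(text: str) -> list[Measurement]:
    """Parse a pipe-delimited table (assumed step-one output) into records."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    y_label = ""
    if lines[0].lower().startswith("y_label:"):
        y_label = lines[0].split(":", 1)[1].strip()
        lines = lines[1:]
    header = [c.strip() for c in lines[0].split("|")]
    x_label, series_names = header[0], header[1:]
    records = []
    for row in lines[1:]:
        cells = [c.strip() for c in row.split("|")]
        # Reject irregular rows instead of silently misaligning columns.
        if len(cells) != len(header):
            raise ValueError(f"row has {len(cells)} cells, expected {len(header)}: {row!r}")
        x = float(cells[0])
        for name, cell in zip(series_names, cells[1:]):
            if cell == "":  # no data point for this series at this x value
                continue
            records.append(Measurement(name, x_label, x, y_label, float(cell)))
    return records


# Hypothetical table text; the values are made up for illustration only.
example = """
y_label: Tensile strength (MPa)
Filler loading (wt%) | Epoxy/silica | Epoxy/CNT
0 | 60.0 | 60.0
1 | 64.5 | 71.2
5 | 70.1 |
"""

records = parse_chart_table(example)
for r in records:
    print(f"{r.series}: {r.x_label}={r.x}, {r.y_label}={r.y}")
```

Each record pairs a legend entry with one (x, y) point, so a step-two prompt can list samples together with their compositions and property values rather than raw table text.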

Symposium Organizers

Kjell Jorner, ETH Zurich

Jian Lin, University of Missouri-Columbia

Daniel Tabor, Texas A&M University

Dmitry Zubarev, IBM

Session Chairs

Kjell Jorner

Jian Lin

Symposium Supporters

2024 MRS Fall Meeting & Exhibit