Dec 3, 2024
4:45pm - 5:00pm
Hynes, Level 2, Room 210
Defne Çirci¹, Miles Bradley¹, Bhuwan Dhingra¹, Catherine Brinson¹
¹Duke University
The rapidly growing predictive power of modern machine-learning algorithms demands parallel advances in the size and quality of domain-specific training datasets. Unfortunately, many academic domains, including materials science, lack such datasets because real-world information is largely unstructured. Despite the wealth of domain knowledge generated by modern materials science research, much of it remains underutilized because the underlying experimental data are often buried in tables and figures. Vision transformer models present a new opportunity to rapidly and accurately extract data and insights from published literature and transform them into structured data formats. However, current end-to-end image-to-text transformer models for chart-to-table translation fail to capture the diverse and complex nature of materials science figures, which makes extracting the underlying data tables a critical yet challenging step. Typical failures include inconsistent extraction of axis labels, irregular presentation of row and column data, and omission of line labels from a chart's legend. We propose an approach that decomposes the task of extracting composition and property information from charts into two steps: (1) converting charts to structured formats such as tables, and (2) passing these tables to language models to obtain sample information with the associated properties and compositions. To address the challenges in step one, we aim to fine-tune a pretrained image-to-text model on materials science figures with complete and consistent annotations. Additionally, we introduce evaluation techniques tailored to the specifics of materials science. Focusing on the subdomain of polymer composites, we demonstrate both the successes and challenges of using multimodal models to extract tabular and chart data.
Successful chart-to-table conversion and extraction of experimental sample information can further enable downstream tasks such as question answering, summarization, and property prediction, improving the utilization of materials science research data.
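
As a rough illustration of the hand-off between the two steps described above, the following Python sketch shows one possible intermediate representation of a chart's table and a check for the failure modes the abstract names (missing axis labels, missing legend labels, empty series). It is not the authors' implementation: the names ChartTable, validate, and to_prompt_table are hypothetical, and the pipe-delimited text layout is an assumption about how a table might be passed to a language model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ChartTable:
    """Hypothetical container for the table recovered from one chart (step 1 output)."""
    x_label: str
    y_label: str
    # Maps each legend label (one per plotted line) to its (x, y) points.
    series: dict[str, list[tuple[float, float]]] = field(default_factory=dict)


def validate(table: ChartTable) -> list[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    if not table.x_label.strip() or not table.y_label.strip():
        problems.append("missing axis label")
    if not table.series:
        problems.append("no data series")
    for name, points in table.series.items():
        if not name.strip():
            problems.append("missing legend label")
        if not points:
            problems.append(f"empty series: {name!r}")
    return problems


def to_prompt_table(table: ChartTable) -> str:
    """Serialize as pipe-delimited text, one row per data point (assumed step 2 input)."""
    lines = [f"series | {table.x_label} | {table.y_label}"]
    for name, points in table.series.items():
        for x, y in points:
            lines.append(f"{name} | {x:g} | {y:g}")
    return "\n".join(lines)
```

A table that passes validate could then be serialized with to_prompt_table and embedded in a prompt asking a language model for each sample's composition and properties. Keeping one row per data point, with the legend label repeated in every row, is one way to avoid the irregular row and column layouts mentioned in the abstract.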