December 1 - 6, 2024
Boston, Massachusetts
2024 MRS Fall Meeting & Exhibit
BI01.02.07

A Dynamic Multi-Modal Fusion Model for Material Discovery

When and Where

Dec 2, 2024
4:00pm - 4:15pm
Sheraton, Second Floor, Constitution B

Presenter(s)

Co-Author(s)

Indra Priyadarsini S¹, Seiji Takeda¹, Lisa Hamada¹, Hajime Shinohara¹

¹IBM Research-Tokyo

Abstract

Recent advances in Artificial Intelligence (AI) and Machine Learning (ML) have created vast opportunities in material discovery, with models trained on various data forms, or modalities, such as SMILES, SELFIES, molecular graphs, spectra, and properties, spanning different domains (such as polymers, drugs, and crystals). Although these unimodal models effectively capture the representations of their respective modalities or domains, models can gain a more comprehensive understanding of materials from representations learned across different modalities.

Multimodal models learn to integrate and process information from diverse sources, which enhances robustness and provides deeper insight than unimodal models. By leveraging information from each modality, multimodal models gain significantly higher representation power and can uncover patterns that may remain hidden to unimodal models.
Previous multimodal fusion methods often combined unimodal models through basic concatenation or similarly simple strategies, which rely on paired representations and may overlook challenges arising from data scarcity or missing modalities. In this work, we propose a dynamic multimodal fusion model that efficiently combines unimodal representations, adapting dynamically to capture a comprehensive representation as needed.

The core objective of our proposed dynamic multimodal fusion model is to improve both the robustness and the performance of the multimodal model by adaptively tailoring the fusion process to the inputs from distinct unimodal models. The key benefits of our proposed approach include:
1. Dynamic Selection: It dynamically selects the unimodal inputs that are most likely to enhance the performance of the fused model, effectively filtering out noisy or less impactful input modalities.
2. Handling Missing Modalities: Our method handles scenarios where paired data for different modalities is scarce or unavailable.

To illustrate our method, we demonstrate its efficacy in combining three modalities (SMILES, SELFIES, and molecular graphs) and benchmark its performance against conventional fusion techniques such as simple concatenation. Our findings reveal that the representation generated through our proposed dynamic fusion strategy significantly surpasses the results achieved by traditional fusion methods on various downstream prediction tasks.

This research presents a flexible new way to combine representations from various modalities, paving the way for a deeper understanding of materials and their properties.
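
The abstract does not specify the fusion mechanism, so the following is only a minimal sketch of one common way the two listed benefits could be realized: a softmax-gated weighted sum taken over whichever modalities are present, shown next to the concatenation baseline. The function names, the `gate_scores` input (which a trained model would presumably produce with a learned gating network), and the toy two-dimensional embeddings are all hypothetical and are not taken from the authors' model.

```python
import math


def concat_fusion(embeddings):
    """Baseline: concatenate embeddings; fails if any modality is missing."""
    if any(vec is None for vec in embeddings.values()):
        raise ValueError("concatenation needs every modality to be present")
    fused = []
    for name in sorted(embeddings):
        fused.extend(embeddings[name])
    return fused


def gated_fusion(embeddings, gate_scores):
    """Softmax-weighted sum over the modalities that are present.

    embeddings:  dict of modality name -> list of floats, or None if missing
    gate_scores: dict of modality name -> float relevance score
                 (hypothetical; assumed to come from a learned gate)
    Assumes all present embeddings share the same dimension.
    """
    present = [name for name, vec in embeddings.items() if vec is not None]
    if not present:
        raise ValueError("at least one modality is required")
    dim = len(embeddings[present[0]])
    if any(len(embeddings[name]) != dim for name in present):
        raise ValueError("embeddings must share one dimension")

    # Softmax restricted to present modalities: missing ones get no weight,
    # and low-scoring (noisy) ones are down-weighted.
    top = max(gate_scores[name] for name in present)
    exps = {name: math.exp(gate_scores[name] - top) for name in present}
    total = sum(exps.values())
    weights = {name: value / total for name, value in exps.items()}

    fused = [
        sum(weights[name] * embeddings[name][i] for name in present)
        for i in range(dim)
    ]
    return fused, weights


# Toy example: the molecular-graph embedding is missing for this molecule.
embeddings = {"smiles": [1.0, 0.0], "selfies": [0.0, 1.0], "graph": None}
gate_scores = {"smiles": 0.0, "selfies": 0.0, "graph": 5.0}
fused, weights = gated_fusion(embeddings, gate_scores)
print(fused, weights)
```

In this sketch the gated sum still returns a fixed-size representation when the graph embedding is absent, whereas the concatenation baseline raises an error on the same input. How the actual model scores and selects modalities is not described in the abstract.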

Symposium Organizers

Deepak Kamal, Solvay Inc
Christopher Kuenneth, University of Bayreuth
Antonia Statt, University of Illinois
Milica Todorović, University of Turku

Session Chairs

Matthew Evans
Pascal Friederich
