
DS01.11.03 2022 MRS Spring Meeting

Neuro-Symbolic Reinforcement Learning for Polymer Discovery

When and Where

May 12, 2022
8:30am - 8:45am

Hawai'i Convention Center, Level 3, Lili'u Theater, 310

Presenter

Co-Author(s)

Sarathkrishna Swaminathan1, Dmitry Zubarev1, Tim Erdmann1, Subhajit Chaudhury1, Asim Munawar1

1IBM Research

Abstract

We present the first application of neuro-symbolic reinforcement learning (RL) in the materials discovery domain. Deep reinforcement learning requires excessively large volumes of training data, and the learned policies lack explainability; as a result, practical application of deep RL in materials discovery is problematic. We explore neuro-symbolic approaches to deep learning that combine the strengths of data-driven AI with the capabilities of human-like symbolic knowledge and reasoning [1]. The result is AI that can learn with fewer resources (compute, data, etc.) and yields an explainable model. Neuro-symbolic approaches are anticipated to enable co-creation of models and policies with subject matter experts (SMEs) by capturing new domain knowledge in symbolic form; this feature is particularly important for learning safety constraints that the AI should follow. We investigate Logical Neural Networks (LNNs) [2], in which each neuron has an explicit meaning as part of a formula in a weighted real-valued logic. In addition, the model is differentiable, so training can acquire new rules and make the network resilient against contradicting facts.

In the presented study we use Logical Optimal Actions (LOA) [3, 4], a neuro-symbolic RL framework based on LNNs, to train RL agents to select experimental conditions for the synthesis of spin-on-glass (SOG) given target values of experimental outcomes. The SOG is based on tetraethyl orthosilicate as the precursor and co-precursors such as phenyltriethoxysilane. Experimental degrees of freedom include temperature, reaction time, precursor/co-precursor ratio, total co-/precursor concentration, water/precursor ratio, and catalyst/precursor ratio. We explicitly pursue training of generalizable agents that learn to navigate the abstract space of experiments relevant to SOG synthesis and to find reaction conditions that yield materials with desired properties. We introduce a data-augmentation strategy that meets the data requirements of reinforcement learning while keeping the volume of experimental data affordable: under 300 experimental data points. Neuro-symbolic RL experiments show that LOA, in combination with logical action-aware features, noticeably improves the agent's performance in the search for experiments targeting a specific molecular weight and polydispersity index of the produced SOG. Furthermore, the agent learns to avoid experimental conditions that produce undesirable outcomes; for example, it avoids conditions leading to gelation of the reaction mixture. Finally, we validate and benchmark the proposed neuro-symbolic RL approach by running spin-on-glass synthesis in the lab following the AI agent's predictions.

[1] Pavan Kapanipathi et al., "Leveraging Abstract Meaning Representation for Knowledge Base Question Answering," ACL-IJCNLP 2021.
[2] Ryan Riegel et al., "Logical Neural Networks," CoRR, abs/2006.13155, 2020.
[3] Subhajit Chaudhury et al., "Neuro-Symbolic Approaches for Text-Based Policy Learning," EMNLP 2021.
[4] Daiki Kimura et al., "Neuro-Symbolic Reinforcement Learning with First-Order Logic," EMNLP 2021.
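To make the LNN idea concrete, the following is a minimal sketch of a single weighted conjunction neuron in the weighted real-valued logic of Riegel et al. [2]. The class name, parameter initialization, and the example rule are illustrative assumptions, not the authors' implementation or the LNN library's API; only the activation form (a clamped, weighted Lukasiewicz-style AND) follows the cited paper.

```python
# Sketch of one LNN-style conjunction neuron: an explicit logical formula
# whose truth value is computed as clamp(beta - sum_i w_i * (1 - x_i), 0, 1).
# Because clamp is differentiable almost everywhere, beta and the weights
# can be tuned by gradient descent, which is how such a network can adjust
# rules and tolerate contradicting facts.
import torch
import torch.nn as nn

class WeightedAnd(nn.Module):  # hypothetical class, for illustration only
    def __init__(self, n_inputs: int):
        super().__init__()
        self.beta = nn.Parameter(torch.tensor(1.0))        # threshold of the formula
        self.weights = nn.Parameter(torch.ones(n_inputs))  # importance of each operand

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x holds truth values in [0, 1], one per operand of the conjunction
        return torch.clamp(self.beta - (self.weights * (1.0 - x)).sum(-1), 0.0, 1.0)

# Example: a rule like AND(high_temperature, long_reaction_time)
rule = WeightedAnd(n_inputs=2)
truth = rule(torch.tensor([0.9, 0.7]))  # truth values of the two facts -> 0.6
truth.backward()                         # gradients reach beta and the weights
print(float(truth))
```

Unlike an opaque dense layer, the neuron above can be read back as a logical formula, which is the explainability property the abstract emphasizes.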
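For the synthesis-planning task, one way to picture the agent's search space is a gym-style environment over the six experimental degrees of freedom listed above. Everything in this sketch is a hypothetical stand-in: the class and function names, the step size, the reward shaping, and the `predict` callback (which substitutes for the authors' data-augmented model of experimental outcomes); the actual LOA framework operates on logical facts and actions rather than this numeric interface.

```python
# Hypothetical environment sketch: the agent nudges one synthesis condition
# per step and is rewarded for approaching target molecular weight (Mw) and
# polydispersity index (PDI), with a penalty for conditions that gel.
import numpy as np

CONDITIONS = ["temperature", "time", "co_precursor_ratio",
              "concentration", "water_ratio", "catalyst_ratio"]

class SOGSynthesisEnv:  # illustrative only, not the LOA API
    def __init__(self, target_mw, target_pdi, predict):
        self.target = np.array([target_mw, target_pdi])
        self.predict = predict  # maps conditions -> (mw, pdi, gelled); assumed given
        self.state = np.full(len(CONDITIONS), 0.5)  # normalized conditions in [0, 1]

    def step(self, action: int):
        # Actions 0..11: raise or lower one condition by a fixed step of 0.1
        idx, direction = divmod(action, 2)
        delta = 0.1 if direction == 0 else -0.1
        self.state[idx] = np.clip(self.state[idx] + delta, 0.0, 1.0)
        mw, pdi, gelled = self.predict(self.state)
        reward = -float(np.linalg.norm(np.array([mw, pdi]) - self.target))
        if gelled:             # the undesirable outcome the agent learns to avoid
            reward -= 10.0
        done = abs(reward) < 0.05
        return self.state.copy(), reward, done
```

The large gelation penalty illustrates how an undesirable experimental outcome can be encoded so that, as reported in the abstract, the trained agent steers away from conditions that gel the reaction mixture.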

Symposium Organizers

Mathieu Bauchy, University of California, Los Angeles
Mathew Cherukara, Argonne National Laboratory
Grace Gu, University of California, Berkeley
Badri Narayanan, University of Louisville
