DS01.09.07 2022 MRS Spring Meeting

Reinforcement Learning for Molecule Space Exploration: Conditioned Latent Representations via Large Scale Self-Supervised Learning

When and Where

May 11, 2022
3:30pm - 3:45pm

Hawai'i Convention Center, Level 3, Lili'u Theater, 310

Presenter

Co-Author(s)

Chih-Hsuan (Bella) Yang¹, Hsin-Jung Yang¹, Vinayak Bhat², Parker Sornberger², Balaji Pokuri¹, Chad Risko², Baskar Ganapathysubramanian¹

¹Iowa State University, ²University of Kentucky

Abstract

We build upon recent advances in constructing continuous latent representations of molecules, focusing on using these representations for downstream molecule space exploration with reinforcement learning. We explore two issues in particular: (a) the potential for distributional shift of the latent space during exploration and (b) considerations of synthesizability and, more generally, multi-property design.

We study the first issue by exploring how various self-supervised losses (clustering, contrastive, redundancy minimization, etc.), along with a computationally generated ‘dense sampling’ of the molecule space, can produce good latent space representations. We provide heuristic metrics for what we mean by ‘good’ latent space representations. We use the self-referencing embedded strings (SELFIES) representation to train a large network on over 250 million molecules. We study the second issue by designing a multi-objective reinforcement learning agent that learns properties from the latent space. As a toy example, we use pre-trained neural networks that predict ionization potentials and HOMO-LUMO gaps. We illustrate a workflow that combines a large-scale self-supervised framework, which produces conditioned latent representations, with a model-based reinforcement learning framework for molecular design that those representations feed. Our large-scale model is available for other researchers to use as a pre-trained model. This is collaborative work with the Risko group (Kentucky) and the Sarkar group (Iowa State).
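To make the multi-objective idea concrete, the following is a minimal sketch of one common way to combine several predicted properties into a single reward for a latent point. It is an illustration under stated assumptions, not the authors' agent: the weighted-average scalarization, the Gaussian desirability function, the toy linear predictors, and all target values, widths, and weights are hypothetical stand-ins for the pre-trained property networks mentioned in the abstract.

```python
import math
from typing import Callable, Dict, Sequence

Latent = Sequence[float]
Predictor = Callable[[Latent], float]


def desirability(value: float, target: float, width: float) -> float:
    """Map a predicted property to (0, 1]; 1.0 means exactly on target."""
    return math.exp(-((value - target) / width) ** 2)


def scalarized_reward(
    z: Latent,
    predictors: Dict[str, Predictor],
    targets: Dict[str, float],
    widths: Dict[str, float],
    weights: Dict[str, float],
) -> float:
    """Weighted average of per-property desirabilities for latent point z."""
    total_weight = sum(weights[name] for name in predictors)
    score = 0.0
    for name, predict in predictors.items():
        score += weights[name] * desirability(
            predict(z), targets[name], widths[name]
        )
    return score / total_weight


# Hypothetical stand-ins for pre-trained property networks (toy linear maps).
def toy_ionization_potential(z: Latent) -> float:
    return 6.0 + 0.5 * z[0]


def toy_homo_lumo_gap(z: Latent) -> float:
    return 2.0 + 0.25 * z[1]


predictors = {"ip": toy_ionization_potential, "gap": toy_homo_lumo_gap}
targets = {"ip": 6.0, "gap": 2.0}  # illustrative targets, assumed to be in eV
widths = {"ip": 1.0, "gap": 0.5}   # assumed tolerance around each target
weights = {"ip": 1.0, "gap": 1.0}  # assumed equal importance

# A latent point whose predictions hit both targets, and one that misses both.
on_target = scalarized_reward([0.0, 0.0], predictors, targets, widths, weights)
off_target = scalarized_reward([2.0, 2.0], predictors, targets, widths, weights)
```

In this sketch a reinforcement learning agent moving through the latent space would receive a higher reward near `[0.0, 0.0]` than near `[2.0, 2.0]`. Changing the weights shifts the trade-off between the two properties, and a further objective such as a synthesizability score could be added as one more entry in each dictionary.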

Keywords

organic

Symposium Organizers

Mathieu Bauchy, University of California, Los Angeles
Mathew Cherukara, Argonne National Laboratory
Grace Gu, University of California, Berkeley
Badri Narayanan, University of Louisville
