MD02.09.04 2023 MRS Spring Meeting

Growing Strings in a Chemical Reaction Space for Searching Retrosynthesis Pathways

When and Where

Apr 25, 2023
8:50am - 9:05am

MD02-virtual

Presenter

Co-Author(s)

Federico Zipoli, Carlo Baldassari, Matteo Manica, Jannis Born, Teodoro Laino

IBM Research Zurich

Abstract

Machine learning algorithms have demonstrated remarkable accuracy in predicting the outcomes of chemical reactions, often outperforming human experts. At the same time, a high level of precision has been achieved in the single-step retrosynthesis prediction problem. However, designing a synthesis pathway that leads to a given product remains a challenging task that runs up against the limits of many currently available ML-driven algorithms. Like the game of chess, retrosynthesis route prediction entails assembling a series of steps to create a given product from existing substances, with the goal of optimizing synthesis efficiency by exploiting specific strategic rules such as protection, deprotection, and functional group interconversion (FGI). Because current machine learning models are trained on single reaction steps, they lack knowledge of these strategic rules. Here, we recast the retrosynthesis problem as a string optimization problem, capitalizing on the homology between the chemical reaction space and a multidimensional geometrical space. If we think of chemical reactions as multidimensional vectors (fingerprints), then a synthesis in this space is a string of three or more connected fingerprints. An extensive corpus of chemical syntheses, comprising approximately 1.2M examples, was extracted and added as strings to the chemical reaction space. We use the Euclidean metric to minimize the distance between the trajectory of the growing retrosynthesis string and the existing strings. By doing so, we aim to assemble steps that, in the chemical reaction space, grow along paths similar to existing retrosyntheses, thereby inheriting the strategic guidelines compiled by domain experts. We integrated this approach into the RXN platform (https://rxn.res.ibm.com/) and present the method's application to complex syntheses, as well as its ability to produce better synthetic strategies than current methodologies.
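
To make the string picture concrete, the following is a minimal Python sketch of one way a growing route could be steered toward existing routes. It is an illustration under stated assumptions, not the authors' implementation: the function names `string_distance` and `pick_next_step`, the sliding-window alignment, the summed per-step Euclidean distance, and the toy two-dimensional fingerprints are all hypothetical choices made for this example. The abstract specifies only that reactions are fingerprints, syntheses are strings of fingerprints, and the Euclidean metric is used.

```python
import math

def string_distance(partial, reference):
    """Smallest summed Euclidean distance between `partial` and any
    contiguous window of `reference` with the same number of steps.
    (The windowed alignment is an assumption made for this sketch.)"""
    k = len(partial)
    if k > len(reference):
        return math.inf  # reference route is too short to compare against
    best = math.inf
    for start in range(len(reference) - k + 1):
        window = reference[start:start + k]
        total = sum(math.dist(p, r) for p, r in zip(partial, window))
        best = min(best, total)
    return best

def pick_next_step(partial, candidates, corpus):
    """Return the candidate fingerprint whose extension of `partial`
    lies closest to some reference string in `corpus`."""
    def score(candidate):
        extended = partial + [candidate]
        return min(string_distance(extended, ref) for ref in corpus)
    return min(candidates, key=score)

# Toy 2-D "fingerprints"; real reaction fingerprints are high-dimensional.
corpus = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],  # hypothetical known route A
    [(0.0, 5.0), (0.0, 6.0), (0.0, 7.0)],  # hypothetical known route B
]
partial = [(0.1, 0.0), (1.1, 0.1)]          # growing string so far
candidates = [(2.0, 0.2), (5.0, 5.0), (0.0, 7.0)]

best = pick_next_step(partial, candidates, corpus)
print(best)
```

In this toy setup the growing string tracks route A, so the candidate that continues along route A is selected. A practical system would presumably combine such a distance with the single-step model's own confidence and use a nearest-neighbor index rather than a linear scan over roughly 1.2M strings; both points are assumptions beyond what the abstract states.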

Keywords

chemical reaction | chemical synthesis

Symposium Organizers

Soumendu Bagchi, Los Alamos National Laboratory
Huck Beng Chew, The University of Illinois at Urbana-Champaign
Haoran Wang, Utah State University
Jiaxin Zhang, Oak Ridge National Laboratory

Symposium Support

Bronze
Patterns and Matter, Cell Press
