Dec 4, 2024
8:00pm - 10:00pm
Hynes, Level 1, Hall A
Yunrui Li¹, Hao Xu², Pengyu Hong¹
Brandeis University¹, Harvard University²
Nuclear magnetic resonance (NMR) spectroscopy is crucial to advancing materials science because it reveals detailed structural information, electronic properties, and molecular dynamics. Accurate prediction of NMR peaks from molecular structures lets materials scientists evaluate and verify candidate structures by comparing predicted shifts with those observed in experimental NMR spectra. While substantial progress has been made in predicting one-dimensional (1D) NMR spectra with machine learning (ML) approaches, two-dimensional (2D) NMR prediction remains a challenge because annotated training data are scarce. In this work, we present a modular, AI-driven approach for automated cross-peak prediction and annotation of experimental 2D NMR data. Specifically, we developed an unsupervised transfer learning framework to train a deep learning model, which achieved promising prediction and annotation accuracy. We trained the pipeline on 19,000 unlabeled heteronuclear single quantum coherence (HSQC) spectra and tested it on 400 HSQC spectra with expert annotations. The mean absolute errors (MAEs) achieved by the model show that it outperforms conventional tools (<i>ChemDraw</i> and <i>MestReNova</i>) in predicting 2D NMR atomic chemical shifts. The effectiveness of our approach highlights the potential of unsupervised learning and transfer learning when labeled experimental data are unavailable, and points to the broader implications of integrating AI into next-generation structural verification and discovery workflows.
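As an illustration of the MAE metric mentioned above, the following is a minimal sketch of how per-nucleus errors could be computed for HSQC cross peaks. It assumes that each predicted peak has already been matched to its expert-annotated counterpart and that a peak is stored as a (¹H shift, ¹³C shift) pair in ppm. The function name `hsqc_mae` and the example shift values are hypothetical and are not taken from this work.

```python
from typing import Sequence, Tuple

# A cross peak as (1H shift in ppm, 13C shift in ppm). This layout is an assumption.
Peak = Tuple[float, float]


def hsqc_mae(predicted: Sequence[Peak], annotated: Sequence[Peak]) -> Tuple[float, float]:
    """Return (1H MAE, 13C MAE) in ppm for index-aligned lists of cross peaks.

    Assumes predicted[i] and annotated[i] refer to the same C-H pair.
    """
    if len(predicted) != len(annotated) or not predicted:
        raise ValueError("peak lists must be non-empty and of equal length")
    n = len(predicted)
    # Average the absolute shift differences separately for each nucleus.
    h_mae = sum(abs(p[0] - a[0]) for p, a in zip(predicted, annotated)) / n
    c_mae = sum(abs(p[1] - a[1]) for p, a in zip(predicted, annotated)) / n
    return h_mae, c_mae


# Hypothetical shift values, for illustration only.
predicted_peaks = [(1.0, 20.0), (2.0, 30.0)]
annotated_peaks = [(1.5, 22.0), (2.5, 31.0)]
print(hsqc_mae(predicted_peaks, annotated_peaks))  # (0.5, 1.5)
```

Reporting the ¹H and ¹³C errors separately is a design choice in this sketch: the two nuclei span very different ppm ranges, so a single pooled value would be dominated by the ¹³C dimension.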