DS01.14.03 2022 MRS Spring Meeting

AI Physicist—Data-Driven Discovery of Mathematical Expressions via Natural Language Processing

When and Where

May 13, 2022
2:00pm - 2:15pm

Hawai'i Convention Center, Level 3, Lili'u Theater, 310

Co-Author(s)

Juwon Na¹, Seungchul Lee¹

¹Pohang University of Science and Technology

Abstract

Natural phenomena can be described by concise mathematical expressions. A central challenge in the natural sciences and engineering therefore lies in symbolic regression: discovering a simple but accurate symbolic expression that fits a given dataset. However, the combinatorial nature of the search over candidate expressions makes the task challenging. In this work, we present a mathematical language model that leverages the representational capacity of natural language processing (NLP) models for symbolic regression. Specifically, our framework involves three main stages: (1) representing mathematical expressions as a language, (2) mathematical language modeling, and (3) bridging mathematical language modeling with reinforcement learning. In extensive experiments on several symbolic regression benchmarks, we demonstrate that our framework improves the ability to recover mathematical expressions from data in terms of (1) accuracy, (2) noise tolerance, and (3) robustness to dummy input variables. Our contributions include a framework that recasts symbolic regression as a natural language understanding task, allowing symbolic regression researchers to leverage recent breakthroughs in language modeling.
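
To make the idea of treating an expression as language concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes that an expression is written as a sequence of tokens in prefix notation and that a candidate is scored by its mean squared error on the dataset. The vocabulary `OPS`, the functions `evaluate` and `mse`, and the toy dataset are all hypothetical names chosen for this example; the abstract does not specify the tokenization or the scoring used in the actual framework.

```python
import math

# Hypothetical token vocabulary: token -> (arity, function).
# The real framework's vocabulary is not given in the abstract.
OPS = {
    "add": (2, lambda a, b: a + b),
    "mul": (2, lambda a, b: a * b),
    "sin": (1, math.sin),
}


def evaluate(tokens, x):
    """Evaluate a prefix-notation token sequence at a scalar input x."""

    def parse(pos):
        tok = tokens[pos]
        if tok in OPS:
            arity, fn = OPS[tok]
            pos += 1
            args = []
            for _ in range(arity):
                val, pos = parse(pos)
                args.append(val)
            return fn(*args), pos
        if tok == "x":
            return x, pos + 1
        # Any other token is treated as a numeric constant.
        return float(tok), pos + 1

    value, end = parse(0)
    if end != len(tokens):
        raise ValueError("token sequence has trailing tokens")
    return value


def mse(tokens, data):
    """Mean squared error of a candidate expression on (x, y) pairs."""
    return sum((evaluate(tokens, x) - y) ** 2 for x, y in data) / len(data)


# The expression x*x + sin(x) written as a token "sentence".
candidate = ["add", "mul", "x", "x", "sin", "x"]

# Toy dataset sampled from the same expression (an assumption for the demo).
data = [(x / 2, (x / 2) ** 2 + math.sin(x / 2)) for x in range(-4, 5)]

print(mse(candidate, data))
```

Under these assumptions, a language model would propose token sequences such as `candidate`, and a fit score such as `mse` could serve as a feedback signal in the reinforcement learning stage. How the framework actually connects the two stages is not described in the abstract.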

Symposium Organizers

Mathieu Bauchy, University of California, Los Angeles
Mathew Cherukara, Argonne National Laboratory
Grace Gu, University of California, Berkeley
Badri Narayanan, University of Louisville
