November 29 - December 2, 2021
Boston, Massachusetts
December 6 - 8, 2021 (Virtual)
2021 MRS Fall Meeting

Tutorial EQ04: Symmetry-Aware Neural Networks for the Material Sciences

Hynes, Level 2, Room 205

Understanding the role of symmetry in the physical sciences is critical for choosing an appropriate machine-learning method. Symmetries such as Euclidean symmetry, permutation symmetry, U(1) gauge symmetry, and the indistinguishability of particles give rise to the behavior of physical systems and are often at the core of exotic properties.

In this tutorial, we will present multiple approaches for incorporating symmetries into machine learning methods, focusing on neural networks. We will discuss differences in mathematical complexity and expressivity of invariance vs. equivariance to Euclidean and permutation symmetries. We will explore unexpected consequences of treating symmetry in machine learning models and do a deep dive into the capabilities of Euclidean neural networks. We will share perspectives on open research questions in neural network design for understanding, discovering, and designing materials.

The morning session includes an overview of symmetry considerations in machine learning, a concrete and tailored introduction to group representation theory for the purposes of symmetry-aware computation and machine learning, and hands-on tutorials for constructing Euclidean-equivariant operations using e3nn, the open-source PyTorch framework for Euclidean neural networks. The afternoon session will demonstrate specific use cases of symmetry-aware methods for diverse applications (molecular dynamics, representation of geometry, and crystal properties).

We will provide a cloud-based notebook environment for participants to run the code examples used throughout the tutorial. Participants will leave the tutorial with theory and code resources in hand and a practical working knowledge of how symmetry considerations impact algorithm design in machine learning and beyond.


Euclidean Symmetry in Machine Learning for Materials Science 
Tess Smidt, Lawrence Berkeley National Laboratory

Understanding symmetry’s role in the physical sciences is critical for choosing an appropriate machine learning method. For example, the coordinates used to describe positions of atoms in a material are traditionally a challenging data type to use for machine learning -- coordinates and coordinate systems are sensitive to the symmetries of 3D space: 3D rotations, translations, and inversion. One of the motivations for incorporating symmetry into machine learning models of 3D data is that it eliminates the need for data augmentation -- the 500-fold increase in brute-force training otherwise necessary for a model to learn 3D patterns in arbitrary orientations. Additionally, many features of physical systems are consequences of symmetry -- geometric tensor properties, point groups and space groups, degeneracy, phase transitions, atomic orbitals, etc. By incorporating symmetry into a model by construction (rather than by training), these consequences arise naturally in the behavior of the model (even in untrained models). There are two general types of symmetry-aware models: invariant and equivariant. Invariant models get rid of coordinate systems by dealing only with quantities that are invariant to the choice of coordinate system (scalars), while equivariant models preserve how quantities predictably change under coordinate transformations. While invariant models are the most prevalent symmetry-aware models because they are mathematically simpler, equivariant models more faithfully represent the complexity of physical interactions. In this tutorial, we will discuss how symmetry emerges when representing physical systems and strategies for accommodating these symmetries when building machine learning algorithms. We will give an overview of some of the properties of Euclidean neural networks, a general neural network framework that fully treats the equivariance of physical systems, naturally handles 3D geometry, and operates on the scalar, vector, and tensor fields that characterize them. Later lectures will dive into how these and related methods are implemented and applied to real-world materials challenges.
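To make the invariant/equivariant distinction concrete, the short sketch below numerically checks equivariance for a simple operation built with e3nn: transforming the input and then applying the map gives the same result as applying the map and then transforming the output. (This is an illustrative sketch, not code from the lecture; the choice of irreps and layer is arbitrary.)

    # Numerical equivariance check with e3nn (illustrative sketch)
    import torch
    from e3nn import o3

    irreps = o3.Irreps("1x0e + 1x1o")      # one scalar and one vector
    linear = o3.Linear(irreps, irreps)     # an equivariant linear map

    x = irreps.randn(1, -1)                # random features
    R = o3.rand_matrix()                   # random 3D rotation
    D = irreps.D_from_matrix(R)            # its action on the features

    out1 = linear(x @ D.T)                 # transform, then apply
    out2 = linear(x) @ D.T                 # apply, then transform
    assert torch.allclose(out1, out2, atol=1e-6)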


Group Theory, Irreducible Representations, and Tensor Products and How to Use Them in e3nn to Build Euclidean Neural Networks

Mario Geiger, EPFL

In this tutorial, we will introduce useful concepts from group theory for creating symmetry-invariant and equivariant algorithms for materials: representations of groups and the vector spaces they act on, including irreducible representations, and how to combine group representations, e.g. via tensor products.

We will connect this theory to practical examples in materials science: for instance, tensor properties of materials and how to express them in terms of Cartesian tensors and irreducible representations.

Euclidean neural networks use these group-theoretic principles to achieve global and local equivariance to 3D rotations, translations, and inversion at every layer. We will go through concrete coding examples of how to build these models using e3nn: a modular open-source PyTorch framework for Euclidean neural networks. In e3nn, the group-theoretic tools discussed above are implemented as practical PyTorch modules. We will cover the Irreps class and operations such as TensorProduct. We will also touch upon topics such as nonlinearities and how these differ from those in traditional neural networks. The e3nn package will continue to be used throughout the morning and afternoon tutorials to demonstrate core concepts in using invariant and equivariant algorithms for practical materials applications.
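As a preview of those coding examples, here is a minimal sketch of the two objects named above (the particular irreps are arbitrary choices for illustration, not the tutorial's exact code):

    # Declaring irreps and building an equivariant bilinear layer in e3nn
    from e3nn import o3

    irreps_in1 = o3.Irreps("2x0e + 1x1o")        # two scalars, one vector
    irreps_in2 = o3.Irreps("1x0e + 1x1o")
    irreps_out = o3.Irreps("1x0e + 2x1o + 1x2e")

    # a learnable, equivariant operation built from the tensor product
    tp = o3.FullyConnectedTensorProduct(irreps_in1, irreps_in2, irreps_out)

    x1 = irreps_in1.randn(10, -1)                # batch of 10 feature vectors
    x2 = irreps_in2.randn(10, -1)
    out = tp(x1, x2)                             # shape [10, irreps_out.dim]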


Analyzing Geometry and Structure of Atomic Configurations with Equivariant and Invariant Functions 
Martin Uhrin, EPFL, and Thomas Hardin, Sandia National Laboratories

Some of the most challenging aspects of applying machine learning to atomic configurations have to do with interfacing atomic configurations (point clouds and species labels) with machine learning methods in a way that 1) respects geometric symmetry, 2) is flexible with respect to the number of atoms in a configuration, and 3) respects the indistinguishability of atoms. e3nn handles each of these issues naturally, making it an ideal framework for machine learning on atomic configurations.

In this hands-on lecture, we will use e3nn to 1) demonstrate how to project atomic environments onto basis function expansions (using the SphericalTensor and FourierTensor objects in e3nn), 2) use those expansions to calculate invariant representations of the atomic configurations, and 3) show how to use optimization techniques and the automatic differentiation capabilities of e3nn to recover geometry from these representations.
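The projection step might look roughly like the following sketch (the neighbor vectors and lmax are made-up values, and per-irrep norms are just one simple choice of rotation-invariant descriptor; this is not the lecture's exact code):

    # Projecting an atomic environment onto spherical harmonics with e3nn
    import torch
    from e3nn.io import SphericalTensor

    lmax = 3
    st = SphericalTensor(lmax, p_val=1, p_arg=-1)

    # hypothetical vectors from a central atom to its neighbors
    neighbors = torch.tensor([[1.0, 0.0, 0.0],
                              [0.0, 1.0, 0.0],
                              [0.0, 0.0, 1.0]])
    projection = st.with_peaks_at(neighbors)   # signal on the sphere, as irreps

    # rotation-invariant descriptors: the norm of each irrep block
    invariants = torch.stack([projection[s].norm() for s in st.slices()])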


Molecular Dynamics with NequIP 
Simon Batzner, Harvard University

Machine-learning interatomic potentials have over the past decade emerged as a powerful tool for increasing the time- and length-scales of molecular dynamics simulations while retaining the high accuracy of the reference calculations they were trained on. Recently, the NequIP potential, an equivariant neural-network interatomic potential, was demonstrated to obtain an unprecedented level of accuracy and sample efficiency, allowing ML interatomic potentials to be trained with up to 1000x less training data than competing methods.

We will demonstrate how to train and subsequently deploy NequIP interatomic potentials in molecular dynamics simulations. We will outline best practices for data set selection and training of the NequIP model, as well as how to run efficient simulations. We will give an introduction to the NequIP software package and detail how to work with its interface to LAMMPS, a highly efficient code for large-scale molecular dynamics simulations, enabling accurate studies of a diverse range of materials.
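As a flavor of the deployment step, the sketch below runs a short MD simulation with a deployed NequIP model through its ASE calculator interface (the file names, temperature, and friction value are placeholders; training and deployment themselves are handled separately by NequIP's command-line tools, and argument names may differ between NequIP versions):

    # Running MD with a deployed NequIP model via ASE (illustrative sketch)
    import ase.io
    from ase import units
    from ase.md.langevin import Langevin
    from nequip.ase import NequIPCalculator

    atoms = ase.io.read("water.xyz")           # hypothetical starting structure
    atoms.calc = NequIPCalculator.from_deployed_model(
        model_path="deployed.pth",             # hypothetical deployed model file
        device="cpu",
    )

    dyn = Langevin(atoms, timestep=0.5 * units.fs,
                   temperature_K=300, friction=0.02)
    dyn.run(1000)                              # 1000 MD steps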


Predicting Electron Densities with e3nn 

Josh Rackers, Sandia National Laboratories

One of the fundamental challenges of atomic-scale science for the 21st century is simulating large molecules with quantum mechanical accuracy. What makes this problem so challenging is the poor scaling of solving the equations of quantum mechanics on a classical computer. Even comparatively efficient quantum chemistry methods, such as density functional theory (DFT), scale with the cube of the system size. This makes accurate simulations of large molecules with these methods computationally intractable.

One proposed method of breaking this “quantum scaling limit” is to use machine learning (ML) to learn quantum mechanics. In this tutorial, we will demonstrate how Euclidean neural networks in general, and e3nn in particular, are well suited for this task. As an instructive example, we will show how to predict the electron density of clusters of water molecules and examine how that model can transfer to larger systems.
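To make the prediction target concrete, the sketch below reconstructs a scalar field from spherical-harmonic expansion coefficients of the kind such a model might output (the lmax, coefficients, and sample points here are arbitrary stand-ins, not the lecture's actual model):

    # Evaluating a scalar field from spherical-harmonic coefficients (sketch)
    import torch
    from e3nn import o3

    lmax = 2
    irreps = o3.Irreps.spherical_harmonics(lmax)   # 1x0e + 1x1o + 1x2e
    coeffs = irreps.randn(1, -1)                   # stand-in for model output

    points = torch.randn(100, 3)                   # sample points near an atom
    Y = o3.spherical_harmonics(irreps, points, normalize=True)  # [100, 9]
    density = (Y * coeffs).sum(dim=-1)             # field value at each point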


Predicting Phonon Properties of Crystal Structures 
Zhantao Chen and Nina Andrejevic, Massachusetts Institute of Technology

The phonon density of states (DoS) is a key determinant of many properties of crystalline solids. However, acquiring the phonon DoS is experimentally nontrivial due to limited inelastic scattering facility resources, and is at the same time computationally expensive for complex materials, particularly disordered systems. These challenges lead to a scarcity of phonon DoS data for data-driven studies.

e3nn enables efficient and direct prediction of the phonon DoS using easily accessible input information about atomic structures, specifically atom types, masses, and positions. By directly incorporating crystallographic and 3D spatial symmetry constraints, e3nn can achieve good performance even when trained on a modest phonon DoS dataset of ~1000 examples. In this lecture, we will demonstrate how to predict the phonon DoS from crystal structures using e3nn, as well as introduce a natural extension to predicting the DoS of alloys.
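Because the DoS is a rotation-invariant quantity, a network's readout can be restricted to scalar (0e) channels and pooled over atoms; here is a minimal sketch of that idea (the hidden irreps, bin count, and atom count are arbitrary illustrative choices, not the lecture's model):

    # Invariant readout of a DoS spectrum from equivariant features (sketch)
    from e3nn import o3

    n_bins = 64                                    # energy bins in the DoS
    irreps_node = o3.Irreps("8x0e + 8x1o + 8x2e")  # assumed per-atom features
    readout = o3.Linear(irreps_node, o3.Irreps(f"{n_bins}x0e"))  # scalars only

    node_feats = irreps_node.randn(12, -1)         # 12 atoms in a toy cell
    dos = readout(node_feats).mean(dim=0)          # pool atoms -> invariant DoS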
