Understanding the role of symmetry in the physical sciences is critical for choosing
an appropriate machine learning method. Symmetries such as Euclidean
symmetry, permutation symmetry, U(1) gauge symmetry, and
indistinguishability of particles give rise to the behavior of physical
systems and are often at the core of exotic properties.
In
this tutorial, we will present multiple approaches for incorporating
symmetries into machine learning methods, focusing on neural networks.
We will discuss differences in mathematical complexity and expressivity
of invariance vs. equivariance to Euclidean and permutation symmetries.
We will explore unexpected consequences of treating symmetry in machine
learning models and do a deep dive into the capabilities of Euclidean
neural networks. We will share perspectives on open research questions
in neural network design for understanding, discovering, and designing
materials.
The morning session includes an overview of symmetry considerations in machine learning, a
concrete and tailored introduction to group representation theory for
the purposes of symmetry-aware computation and machine learning, and
hands-on tutorials for constructing Euclidean equivariant operations
using e3nn, the open-source PyTorch framework for Euclidean neural
networks. The afternoon session will demonstrate specific use cases of
symmetry-aware methods for diverse applications (molecular dynamics,
representation of geometry, and crystal properties).
We will provide a cloud-based notebook environment for participants to run
the code examples used throughout the tutorial. Participants will leave
the tutorial with theory and code resources in hand and a practical
working knowledge of how symmetry considerations impact algorithm design
in machine learning and beyond.
Euclidean Symmetry in Machine Learning for Materials Science
Tess Smidt, Lawrence Berkeley National Laboratory
Understanding
symmetry’s role in the physical sciences is critical for choosing an
appropriate machine learning method. For example, coordinates used to
describe positions of atoms in a material are traditionally a
challenging data type to use for machine learning -- coordinates and
coordinate systems are sensitive to the symmetries of 3D space: 3D
rotations, translations, and inversion. One of the motivations for
incorporating symmetry into machine learning models on 3D data is that
it eliminates the need for data augmentation -- the 500-fold increase in
brute-force training necessary for a model to learn 3D patterns in
arbitrary orientations. Additionally, many features of physical systems
are consequences of symmetry -- geometric tensor properties, point
groups and space groups, degeneracy, phase transitions, atomic
orbitals, etc. By incorporating symmetry into a model by construction
(rather than by training), these consequences arise naturally in the
behavior of the model (even in untrained models). There are two general
types of symmetry-aware models: invariant and equivariant. Invariant
models get rid of coordinate systems by only dealing with quantities
that are invariant to the choice of coordinate system (scalars), while
equivariant models preserve how quantities predictably change under
coordinate transformations. While invariant models are the most
prevalent symmetry-aware models because they are mathematically simpler,
equivariant models more faithfully represent the complexity of physical
interactions. In this tutorial, we will discuss how symmetry emerges
when representing physical systems and strategies for accommodating
these symmetries when building machine learning algorithms. We will give
an overview of some of the properties of Euclidean neural networks, a
general neural network framework that fully treats the equivariance of
physical systems, naturally handles 3D geometry, and operates on the
scalar, vector, and tensor fields that characterize those systems. Later
lectures will dive into how these and related methods are implemented and
applied to real-world materials challenges.
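The distinction between invariant and equivariant quantities can be checked numerically. The following is a minimal sketch in plain Python (not e3nn code); the three-atom geometry, the per-atom weights, and the helper names are hypothetical and chosen only for illustration. It rotates and translates a small point cloud, then confirms that interatomic distances are unchanged (invariant) while a weighted displacement vector rotates along with the system (equivariant).

```python
import math

def rotation(axis, angle):
    """Rotation matrix about the x or z axis (enough to build a generic rotation)."""
    c, s = math.cos(angle), math.sin(angle)
    if axis == "x":
        return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def apply(matrix, vector):
    return [sum(matrix[i][k] * vector[k] for k in range(3)) for i in range(3)]

def pairwise_distances(points):
    """Invariant feature: the sorted interatomic distances (scalars)."""
    return sorted(math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:])

def weighted_displacement(points, weights):
    """Equivariant feature: weighted sum of positions relative to the centroid (a vector)."""
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    return [sum(w * (p[k] - centroid[k]) for p, w in zip(points, weights)) for k in range(3)]

def close(u, v, tol=1e-9):
    return len(u) == len(v) and all(abs(a - b) < tol for a, b in zip(u, v))

# Hypothetical three-atom geometry and per-atom weights, for illustration only.
points = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]
weights = [-0.8, 0.4, 0.4]

R = matmul(rotation("z", 0.7), rotation("x", -1.2))
shift = [1.5, -2.0, 0.3]
moved = [[a + b for a, b in zip(apply(R, p), shift)] for p in points]

# Scalars are unchanged by the rotation and translation.
invariant_ok = close(pairwise_distances(points), pairwise_distances(moved))

# The vector changes, but exactly as the rotation prescribes.
before = weighted_displacement(points, weights)
after = weighted_displacement(moved, weights)
equivariant_ok = close(after, apply(R, before))
vector_changed = not close(after, before)

print(invariant_ok, equivariant_ok, vector_changed)
```

An invariant model would only ever see quantities like the distances; an equivariant model can also carry the vector through its layers, because it knows how that vector transforms.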
Group Theory, Irreducible Representations, and Tensor Products and How to Use Them in e3nn to Build Euclidean Neural Networks
Mario Geiger, EPFL
In this tutorial, we will introduce useful concepts from group theory for
creating symmetry-invariant and symmetry-equivariant algorithms for materials:
representations of groups and the vector spaces they act on, including
irreducible representations, and how to combine group representations,
e.g., with tensor products.
We will connect this theory to practical examples in materials science. For
example, we will present tensor properties of materials and show how
to express them in terms of Cartesian tensors and irreducible
representations.
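As a concrete instance of moving between Cartesian tensors and irreducible representations, the sketch below (plain Python, not e3nn code; the vectors and helper names are hypothetical) forms the tensor product of two vectors and splits the resulting 3x3 tensor into its trace part (1 component), antisymmetric part (3 components), and symmetric traceless part (5 components), for 1 + 3 + 5 = 9 components in total. It then checks that rotating the tensor and decomposing it gives the same result as decomposing first and rotating each piece, which is the sense in which the pieces do not mix under rotation.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def rotate_tensor(R, T):
    """Rank-2 Cartesian tensors transform as T' = R T R^T."""
    return matmul(matmul(R, T), transpose(R))

def decompose_rank2(T):
    """Split a 3x3 tensor into trace, antisymmetric, and symmetric traceless parts."""
    third_trace = (T[0][0] + T[1][1] + T[2][2]) / 3.0
    isotropic = [[third_trace if i == j else 0.0 for j in range(3)] for i in range(3)]
    antisymmetric = [[0.5 * (T[i][j] - T[j][i]) for j in range(3)] for i in range(3)]
    symmetric_traceless = [
        [0.5 * (T[i][j] + T[j][i]) - isotropic[i][j] for j in range(3)] for i in range(3)
    ]
    return isotropic, antisymmetric, symmetric_traceless

def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))

# Tensor product of two arbitrary vectors: T_ij = a_i * b_j (9 components).
a, b = [1.0, -2.0, 0.5], [0.3, 0.8, -1.1]
T = [[a[i] * b[j] for j in range(3)] for i in range(3)]
parts = decompose_rank2(T)
isotropic, antisymmetric, symmetric_traceless = parts

# The three parts add back up to the original tensor.
rebuilt = [[sum(part[i][j] for part in parts) for j in range(3)] for i in range(3)]

# The trace part carries the dot product; the antisymmetric part carries the cross product.
dot = sum(x * y for x, y in zip(a, b))
cross = [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]
from_antisymmetric = [2 * antisymmetric[1][2], 2 * antisymmetric[2][0], 2 * antisymmetric[0][1]]

# Rotate-then-decompose equals decompose-then-rotate: the parts never mix.
c, s = math.cos(0.7), math.sin(0.7)
cx, sx = math.cos(-1.2), math.sin(-1.2)
R = matmul([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]],
           [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
rotated_parts = decompose_rank2(rotate_tensor(R, T))
parts_do_not_mix = all(
    close(rotated, rotate_tensor(R, part)) for rotated, part in zip(rotated_parts, parts)
)
print(parts_do_not_mix)
```

A symmetric property tensor, such as a dielectric tensor, has a vanishing antisymmetric part, so only the trace and symmetric traceless pieces (1 + 5 = 6 components) remain.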
Euclidean neural networks use these group-theoretic
principles to achieve global and local equivariance to 3D
rotations, translations, and inversion at every layer. We will go
through concrete coding examples for how to build these models using
e3nn, a modular open-source PyTorch framework for Euclidean neural
networks. In e3nn, the group-theoretic tools
discussed above are implemented as practical PyTorch modules. We will
cover the Irreps class and operations such as TensorProduct. We will
also touch upon topics such as nonlinearities and how these differ from
traditional neural networks. The e3nn package will continue to be used
throughout the morning and afternoon tutorials to demonstrate core
concepts in using invariant and equivariant algorithms for practical
materials applications.
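One reason nonlinearities differ from those in traditional neural networks can be seen in a few lines. The sketch below is plain Python and is not e3nn's implementation; the gating function and all names are assumptions made for illustration. It shows that applying ReLU to each Cartesian component of a vector does not commute with rotation, whereas scaling the vector by a function of its (rotation-invariant) norm does.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def apply(matrix, vector):
    return [sum(matrix[i][k] * vector[k] for k in range(3)) for i in range(3)]

def close(u, v, tol=1e-9):
    return all(abs(a - b) < tol for a, b in zip(u, v))

def relu_componentwise(vector):
    """Conventional nonlinearity applied to each Cartesian component."""
    return [max(0.0, x) for x in vector]

def norm_gate(vector):
    """Scale the vector by a sigmoid of its norm; the norm is invariant, so the
    direction is untouched and the operation commutes with rotation."""
    norm = math.sqrt(sum(x * x for x in vector))
    gate = 1.0 / (1.0 + math.exp(-norm))
    return [gate * x for x in vector]

# An arbitrary rotation and an arbitrary input vector.
c, s = math.cos(0.7), math.sin(0.7)
cx, sx = math.cos(-1.2), math.sin(-1.2)
R = matmul([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]],
           [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
v = [1.0, -2.0, 0.5]

# Equivariance means f(R v) == R f(v).
relu_commutes = close(relu_componentwise(apply(R, v)), apply(R, relu_componentwise(v)))
gate_commutes = close(norm_gate(apply(R, v)), apply(R, norm_gate(v)))

print(relu_commutes, gate_commutes)
```

Scalar (invariant) features can still pass through any ordinary nonlinearity; the restriction applies only to features that change under rotation.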
Analyzing Geometry and Structure of Atomic Configurations with Equivariant and Invariant Functions
Martin Uhrin, EPFL, and Thomas Hardin, Sandia National Laboratories
Some
of the most challenging aspects of applying machine learning to atomic
configurations have to do with interfacing atomic configurations (point
clouds and species labels) with machine learning methods in a way that
1) respects geometric symmetry, 2) is flexible with respect to the
number of atoms in a configuration, and 3) respects indistinguishability
of atoms. e3nn handles each of these issues naturally, making it an
ideal framework for machine learning on atomic configurations.
In
this hands-on lecture, we will use e3nn to 1) demonstrate how to
project atomic environments onto basis function expansions (using the
SphericalTensor and FourierTensor objects in e3nn), 2) use those
expansions to calculate invariant representations of the atomic
configurations, and 3) show how to use optimization techniques and the
automatic differentiation capabilities of e3nn to recover geometry from
these representations.
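Steps 1 and 2 above can be sketched without e3nn. The following plain-Python example is an illustration under stated assumptions, not the SphericalTensor implementation: the neighbor positions, the Gaussian radial weight, and the function names are hypothetical, and the real spherical harmonics are written out by hand up to l = 2 with a convenient normalization. It projects a set of neighbor vectors onto spherical harmonics and forms the power spectrum, which is unchanged when the environment is rotated or the atoms are relabeled.

```python
import math

def real_sph_harm(unit):
    """Real spherical harmonics for l = 0, 1, 2 on a unit vector, scaled so that
    the squares sum to 1 for each l (a normalization chosen for convenience)."""
    x, y, z = unit
    s3 = math.sqrt(3.0)
    return {
        0: [1.0],
        1: [x, y, z],
        2: [s3 * x * y, s3 * y * z, 0.5 * (3.0 * z * z - 1.0), s3 * x * z,
            0.5 * s3 * (x * x - y * y)],
    }

def project_environment(neighbors, sigma=1.0):
    """Expansion coefficients c_lm = sum_i w(|r_i|) * Y_lm(r_i / |r_i|)."""
    coeffs = {l: [0.0] * (2 * l + 1) for l in range(3)}
    for r in neighbors:
        dist = math.sqrt(sum(c * c for c in r))
        unit = [c / dist for c in r]
        weight = math.exp(-0.5 * (dist / sigma) ** 2)  # assumed Gaussian radial weight
        for l, values in real_sph_harm(unit).items():
            for m, y in enumerate(values):
                coeffs[l][m] += weight * y
    return coeffs

def power_spectrum(coeffs):
    """Invariant descriptor: p_l = sum_m c_lm^2 for each l."""
    return [sum(c * c for c in coeffs[l]) for l in sorted(coeffs)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def apply(matrix, vector):
    return [sum(matrix[i][k] * vector[k] for k in range(3)) for i in range(3)]

# Hypothetical neighbor positions relative to a central atom.
neighbors = [[1.0, 0.2, -0.3], [-0.5, 0.9, 0.4], [0.1, -0.8, 0.7], [0.6, 0.6, -0.9]]

c, s = math.cos(0.7), math.sin(0.7)
cx, sx = math.cos(-1.2), math.sin(-1.2)
R = matmul([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]],
           [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])

# Rotate the environment and reverse the atom order.
transformed = [apply(R, r) for r in reversed(neighbors)]

original = project_environment(neighbors)
rotated = project_environment(transformed)
spectrum_matches = all(
    abs(p - q) < 1e-9 for p, q in zip(power_spectrum(original), power_spectrum(rotated))
)
coefficients_changed = any(abs(p - q) > 1e-6 for p, q in zip(original[1], rotated[1]))
print(spectrum_matches, coefficients_changed)
```

Summing over atoms handles both the variable number of atoms and their indistinguishability; summing squared coefficients over m within each l removes the dependence on orientation.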
Molecular Dynamics with NequIP
Simon Batzner, Harvard University
Machine-learning interatomic potentials have emerged over the past decade
as a powerful tool for increasing the time and length scales of molecular
dynamics simulations while retaining the high accuracy of the reference
calculations they were trained on. Recently, the NequIP potential, an
equivariant neural-network-based interatomic potential, has been
demonstrated to obtain an unprecedented level of accuracy and sample
efficiency, allowing ML interatomic potentials to be trained with up to
1000x less training data than competing methods.
We will demonstrate how to train and subsequently deploy
NequIP interatomic potentials in molecular dynamics simulations. We
will outline best practices for data set selection and training of the
NequIP model as well as how to successfully run efficient simulations.
We will give an introduction to the NequIP software and detail how
to work with its interface to the LAMMPS molecular dynamics software
suite, a highly efficient code for large-scale molecular dynamics
simulations, enabling the accurate study of a diverse
range of materials.
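To make the role of an interatomic potential in a simulation concrete, here is a toy sketch in plain Python. It does not use NequIP or LAMMPS: a Lennard-Jones pair potential in reduced units stands in for a learned model, and the function names, time step, and starting geometry are assumptions for illustration. The potential returns an energy and forces, a finite-difference check confirms that the forces are the negative gradient of the energy, and a velocity Verlet integrator advances the atoms.

```python
import math

def lennard_jones(positions):
    """Toy stand-in for a learned potential: returns (energy, forces) for two atoms.

    Reduced units (epsilon = sigma = 1). The energy depends only on the
    interatomic distance, so it is invariant to rotations and translations.
    """
    delta = [p - q for p, q in zip(positions[0], positions[1])]
    r2 = sum(d * d for d in delta)
    inv6 = 1.0 / r2 ** 3
    energy = 4.0 * (inv6 * inv6 - inv6)
    scale = 24.0 * (2.0 * inv6 * inv6 - inv6) / r2  # (-dE/dr) / r
    force_on_0 = [scale * d for d in delta]
    return energy, [force_on_0, [-f for f in force_on_0]]

def velocity_verlet(positions, velocities, potential, dt, steps, mass=1.0):
    """Integrate in place and return the final total (potential + kinetic) energy."""
    energy, forces = potential(positions)
    for _ in range(steps):
        for atom in range(len(positions)):
            for k in range(3):
                velocities[atom][k] += 0.5 * dt * forces[atom][k] / mass
                positions[atom][k] += dt * velocities[atom][k]
        energy, forces = potential(positions)
        for atom in range(len(positions)):
            for k in range(3):
                velocities[atom][k] += 0.5 * dt * forces[atom][k] / mass
    kinetic = 0.5 * mass * sum(v * v for atom in velocities for v in atom)
    return energy + kinetic

positions = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]
velocities = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
initial_energy, initial_forces = lennard_jones(positions)

# Finite-difference check: the force is the negative gradient of the energy.
h = 1e-6
plus = lennard_jones([[h, 0.0, 0.0], [1.5, 0.0, 0.0]])[0]
minus = lennard_jones([[-h, 0.0, 0.0], [1.5, 0.0, 0.0]])[0]
numerical_force_x = -(plus - minus) / (2.0 * h)

# Atoms start at rest, so the initial total energy is purely potential.
final_energy = velocity_verlet(positions, velocities, lennard_jones, dt=0.0005, steps=4000)
energy_drift = abs(final_energy - initial_energy)
print(energy_drift)
```

Because the forces are derived from an energy, the total energy stays nearly constant over the run; a learned potential used in molecular dynamics is expected to fill the same energy-and-forces role as the toy function here.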
Predicting Electron Densities with e3nn
Josh Rackers, Sandia National Laboratories
One of the fundamental challenges of atomic-scale science for the 21st
century is simulating large molecules with quantum
mechanical accuracy. What makes this problem so challenging is the poor
scaling of solving the equations of quantum mechanics on a classical
computer. Even the most efficient quantum chemistry methods, such as
density functional theory (DFT), scale with the cube of the system size.
This makes accurate simulations of large molecules with these methods
computationally prohibitive.
One proposed
method of breaking this “quantum scaling limit” is to use machine
learning (ML) to learn quantum mechanics. In this tutorial, we will
demonstrate how Euclidean Neural Networks in general, and e3nn in
particular, are well-suited for this task. As an instructive example, we
will show how to predict the electron density of clusters of water
molecules and examine how that model can transfer to larger systems.
Predicting Phonon Properties of Crystal Structures
Zhantao Chen and Nina Andrejevic, Massachusetts Institute of Technology
Phonon
density of states (DoS) is a key determinant of many properties of
crystalline solids. However, acquisition of phonon DoS is experimentally
nontrivial due to limited inelastic scattering facility resources, and
at the same time computationally expensive for complex materials,
particularly disordered systems. These challenges in turn lead to a
scarcity of phonon DoS data for data-driven studies.
e3nn
enables efficient and direct prediction of phonon DoS using easily
accessible input information about atomic structures, specifically atom
types, masses, and positions. By directly incorporating crystallographic
and 3D spatial symmetry constraints, e3nn can achieve good performance
even when trained on a modest phonon DoS dataset of ~1000 examples. In
this lecture, we will demonstrate how to predict phonon DoS from crystal
structures using e3nn, as well as introduce the natural extension of
predicting the DoS of alloy materials.