Hynes, Level 2, Room 210
This will be a hands-on tutorial. We will begin with an overview of applied machine learning fundamentals. Then we will showcase successful applications of machine learning to materials R&D challenges, including diagnosis, process optimization, and discovery/design. We will also review some of the hardware developments that have emerged in parallel with machine learning tools. In closing, a perspective on future challenges in this field will be offered.
8:30 am
Hands-On Deep Learning for Materials
Edward Kim, Citrine Informatics
A brief overview of fundamental deep learning concepts will be provided, and these concepts will be introduced alongside more familiar (e.g., linear) machine learning methods. Attendees will then have the opportunity to build and train their own neural network models in Jupyter + Python.
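To give a flavor of the hands-on portion, the sketch below contrasts a linear least-squares fit with a small one-hidden-layer neural network on a toy nonlinear regression task. This is an illustrative example only, not the tutorial's actual notebook; the dataset, network size, and learning rate are arbitrary choices.

```python
import numpy as np

# Toy dataset: a nonlinear target that a purely linear model cannot capture.
rng = np.random.default_rng(0)
X = np.linspace(-1, 1, 64).reshape(-1, 1)
y = np.sin(3 * X) + 0.05 * rng.normal(size=X.shape)

# Linear baseline: least squares with a bias column.
A = np.hstack([X, np.ones_like(X)])
w_lin, *_ = np.linalg.lstsq(A, y, rcond=None)
lin_mse = np.mean((A @ w_lin - y) ** 2)

# One-hidden-layer neural network trained by full-batch gradient descent.
H = 16                                    # hidden units (arbitrary)
W1 = rng.normal(scale=0.5, size=(1, H))   # input -> hidden weights
b1 = np.zeros(H)
W2 = rng.normal(scale=0.5, size=(H, 1))   # hidden -> output weights
b2 = np.zeros(1)
lr = 0.1

for _ in range(5000):
    h = np.tanh(X @ W1 + b1)              # forward pass
    pred = h @ W2 + b2
    # Backpropagate the mean-squared-error gradient.
    grad_pred = 2 * (pred - y) / len(X)
    gW2 = h.T @ grad_pred
    gb2 = grad_pred.sum(axis=0)
    grad_h = (grad_pred @ W2.T) * (1 - h ** 2)   # tanh' = 1 - tanh^2
    gW1 = X.T @ grad_h
    gb1 = grad_h.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

nn_mse = np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2)
print(f"linear MSE: {lin_mse:.4f}  neural-net MSE: {nn_mse:.4f}")
```

The point of the comparison is that the neural network, unlike the linear model, can bend its fit to the sinusoidal target while sharing the same train-by-minimizing-error workflow.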
10:00 am BREAK
10:30 am
Success Cases
Edward Kim, Citrine Informatics; Joshua Schrier, Fordham University; Tonio Buonassisi, Massachusetts Institute of Technology
Citrine will highlight recent success cases in sequential (active) learning and related methods.
1:30 pm
Taking ML into the Lab: Hardware Developments
Joshua Schrier, Fordham University
ML and AI are "brains" making sense of data, but we also need "hands" to conduct new experiments and collect the results. This section of the tutorial will provide an overview of the general classes of automated and high-throughput synthesis and characterization methods, highlight efforts at a variety of scales, discuss shared challenges, and provide resources for further learning. The intended audience includes both non-experimentalists who want to learn more about where their data come from and "traditional" experimentalists interested in automating the execution of their experiments and the collection of the resulting data to facilitate interactions with machine learning.
Where does your data come from and how did it get there?
Small, medium, or large? Experimentation scale in space and time.
Closing the loop: Autonomy and the need for Software and Data Management
3:00 pm BREAK
3:30 pm
What’s the Best Experiment to Do Next? An Introduction to Gaussian Processes and Active Learning
A. Gilad Kusne, National Institute of Standards and Technology
A common challenge is identifying the best (most informative) experiment to perform next, whether in the laboratory or in silico. Active learning provides a framework for doing just that. This tutorial will use open-source tools (Anaconda, Jupyter, scikit-learn, etc.) and hands-on exercises in Jupyter to introduce attendees to Gaussian processes and active learning. Attendees will first learn about Gaussian process regression, a Bayesian regression method that provides uncertainty estimates for its predictions. We will then investigate a variety of active learning and Bayesian optimization schemes that exploit these predictions (and their uncertainties) to select the next experiment to perform. Attendees who want to follow along on their own laptops should arrive with Anaconda Python 3.7 already installed.
Gaussian Processes
Active Learning and Bayesian Optimization
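The loop described above — fit a Gaussian process, then use its predictive uncertainty to pick the next experiment — can be sketched in plain NumPy. This is a minimal illustration rather than the tutorial's actual exercise; the hidden objective `f`, the RBF kernel length scale, and the uncertainty-sampling acquisition rule are all assumed choices.

```python
import numpy as np

def rbf(a, b, length=0.3):
    """Squared-exponential kernel k(a, b) = exp(-(a - b)^2 / (2 l^2))."""
    d = a.reshape(-1, 1) - b.reshape(1, -1)
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(X_train, y_train, X_test, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf(X_test, X_train)
    Kss = rbf(X_test, X_test)
    mean = Ks @ np.linalg.solve(K, y_train)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std

# Stand-in "experiment": in practice this would be a real measurement.
f = lambda x: np.sin(6 * x)

X_pool = np.linspace(0, 1, 101)      # candidate experiments
X_train = np.array([0.1, 0.9])       # two initial measurements
y_train = f(X_train)

for _ in range(6):
    mean, std = gp_posterior(X_train, y_train, X_pool)
    nxt = X_pool[np.argmax(std)]     # uncertainty sampling: most informative point
    X_train = np.append(X_train, nxt)
    y_train = np.append(y_train, f(nxt))

mean, std = gp_posterior(X_train, y_train, X_pool)
print(f"max posterior std after active learning: {std.max():.3f}")
```

Each round, the GP's predictive standard deviation is largest far from existing measurements, so uncertainty sampling spreads experiments to shrink the model's overall uncertainty; Bayesian optimization schemes instead trade this exploration off against exploiting high predicted values.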