Marcus Noack
Lawrence Berkeley National Laboratory
The fields of machine learning (ML) and artificial intelligence (AI) have transformed almost every aspect of science and engineering. The excitement around AI/ML methods stems in large part from their perceived novelty compared with traditional methods of statistics, computation, and applied mathematics. But clearly, all methods in ML have their foundations in mathematical theories, such as function approximation, uncertainty quantification, and function optimization. Autonomous experimentation is no exception; it is often formulated as a chain of off-the-shelf tools, organized in a closed loop, without emphasis on the intricacies of each algorithm involved. The uncomfortable truth is that the success of any ML endeavor, and this includes autonomous experimentation, strongly depends on the sophistication of the underlying mathematical methods and software, which must be flexible enough to admit functions consistent with particular physical theories, knowledge, and intuition. We have observed that standard off-the-shelf tools, used by many in the applied ML community, often hide the underlying complexities and therefore perform poorly. In this talk, I want to give a perspective on the intricate connections between mathematics and ML, with a focus on Gaussian processes (GPs) for uncertainty quantification and autonomous experimentation. Although the Gaussian process is a powerful mathematical concept, it has to be implemented and customized correctly for optimal performance; it is often criticized for unrealistic uncertainty quantification and poor scaling when used in its standard setup. The reason, however, is often not the method itself but a lack of flexibility and domain awareness in the underlying prior probability distribution.
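The point about the prior can be made concrete with a minimal NumPy sketch (not code from the talk; the function names and the toy sine-wave data are illustrative assumptions). A GP with a generic stationary kernel reverts to high prior uncertainty away from the data, while a kernel that encodes known domain structure — here, periodicity — extrapolates with far tighter uncertainty:

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0):
    # Generic "off-the-shelf" squared-exponential kernel.
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def periodic_kernel(x1, x2, period=2.0 * np.pi, length_scale=1.0):
    # Kernel encoding domain knowledge: the signal is periodic.
    d = np.abs(x1[:, None] - x2[None, :])
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / length_scale ** 2)

def gp_posterior(x_train, y_train, x_test, kernel, noise=1e-4):
    # Standard GP posterior mean and standard deviation via Cholesky.
    K = kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = kernel(x_train, x_test)
    Kss = kernel(x_test, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, Ks)
    mean = Ks.T @ alpha
    cov = Kss - v.T @ v
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))

# Toy data: one period of a sine wave (an assumed stand-in for real data).
x_train = np.linspace(0.0, 2.0 * np.pi, 8)
y_train = np.sin(x_train)
x_test = np.array([10.0])  # well outside the training window

_, std_generic = gp_posterior(x_train, y_train, x_test, rbf_kernel)
_, std_informed = gp_posterior(x_train, y_train, x_test, periodic_kernel)
```

With the generic kernel, the predictive standard deviation at the extrapolation point is essentially the prior's; with the periodicity-aware kernel, it collapses, because the test point is effectively interior to the data under the assumed period. Same method, different prior, very different uncertainty quantification.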
I will start this talk by discussing some recent examples in which GPs were applied to various approximation and decision-making problems; we will discover, by example, where the challenges, intricacies, and complexities of this methodology lie, and subsequently, how they can be addressed to yield improved performance. We will continue this thought process by focusing on how the right customizations can improve autonomous experimentation. I will present several simple toy problems to explore these nuances and highlight the importance of mathematical and statistical rigor in autonomous experimentation, uncertainty quantification, and ML.
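The closed-loop structure of autonomous experimentation can likewise be sketched in a few lines, assuming a pure-exploration acquisition policy (measure wherever the GP's predictive uncertainty is largest); the `instrument` function and all parameters below are hypothetical placeholders, not part of the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(a, b, length_scale=0.15):
    # Squared-exponential kernel with unit prior variance.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length_scale) ** 2)

def posterior_std(x_meas, x_cand, noise=1e-4):
    # GP predictive standard deviation at candidate points.
    K = rbf(x_meas, x_meas) + noise * np.eye(len(x_meas))
    v = np.linalg.solve(np.linalg.cholesky(K), rbf(x_meas, x_cand))
    return np.sqrt(np.clip(1.0 - np.sum(v * v, axis=0), 0.0, None))

def instrument(x):
    # Hypothetical stand-in for a real, noisy measurement.
    return np.sin(6.0 * x) + 0.01 * rng.normal(size=np.shape(x))

candidates = np.linspace(0.0, 1.0, 200)
x_meas = rng.uniform(0.0, 1.0, size=3)   # a few seed measurements
y_meas = instrument(x_meas)

initial_gap = posterior_std(x_meas, candidates).max()
for _ in range(10):
    std = posterior_std(x_meas, candidates)
    x_next = candidates[np.argmax(std)]  # decide: measure where we know least
    x_meas = np.append(x_meas, x_next)   # act: run the experiment there
    y_meas = np.append(y_meas, instrument(x_next))  # observe
final_gap = posterior_std(x_meas, candidates).max()
```

Even this toy loop makes the talk's point visible: every design choice hidden inside it — the kernel, the noise model, the acquisition rule — is a mathematical decision, and swapping any of them for a more domain-aware alternative changes what the loop learns and how fast.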