Apr 9, 2025
11:00am - 11:30am
Summit, Level 3, Room 344
Katherine Sytwu (1), Luis Rangel DaCosta (2), Catherine Groschner (2), Min Gee Cho (1), Mary Scott (2,1)
(1) Lawrence Berkeley National Laboratory; (2) University of California, Berkeley
Neural networks and related data-driven machine learning techniques have emerged as powerful data science tools that can potentially automate information extraction from large, in situ TEM datasets with better accuracy and speed. However, many of these models have difficulty generalizing, i.e., performing well on data that differs from the data used during training. This inability to generalize across datasets has consequences for deploying these models on data streams with changing conditions, such as in situ TEM data. Without an understanding of which training protocols produce a robust network, or of the conditions under which a model might fail, one would need to either retrain a neural network for every dataset to be analyzed, or acquire and label a large dataset of TEM images spanning every sample and imaging condition, both of which hinder wider adoption of neural networks for TEM analysis.
In this talk, I’ll discuss how curated TEM datasets have enabled us to explore how choices in training data, data representation, and model architecture can bias the output of a machine learning model in the context of nanoparticle characterization. First, we show how data preprocessing, the conversion of raw data into a form suitable for algorithm input, affects model performance in both nanoparticle identification and shape analysis in high-resolution TEM images. By training and cross-validating neural networks on our curated datasets, we not only identify the conditions under which these networks generalize, but also uncover the biases that arise when test data differs from training data. Next, we show how neural network architecture features, namely the receptive field, affect model performance. By combining systematically varied network architectures with these curated datasets, we find that a large receptive field is needed to identify low-contrast nanoparticle features. Overall, our results point toward the need for increased analysis of the data inputs to our data-driven algorithms, and for further quantitative studies of the robustness of trained models.
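For readers unfamiliar with the term, the receptive field of a feed-forward convolutional network can be computed with a standard recurrence over the layers' kernel sizes and strides. The sketch below is illustrative only; the layer configuration is a made-up example, not one of the architectures studied in the talk.

```python
# Minimal sketch: receptive-field growth in a stacked conv/pool network.
# The layer list is hypothetical, not the models from this work.

def receptive_field(layers):
    """Receptive field (in input pixels) of a stack of conv/pool layers.

    Each layer is a (kernel_size, stride) pair. Uses the standard
    recurrence: r <- r + (k - 1) * j and j <- j * s, where j is the
    cumulative stride ("jump") and both start at 1.
    """
    r, j = 1, 1
    for k, s in layers:
        r += (k - 1) * j  # each kernel widens the field by (k-1) input-space steps
        j *= s            # striding multiplies the spacing of later samples
    return r

# Example: four 3x3 convs with a stride-2 pool after each pair.
layers = [(3, 1), (3, 1), (2, 2), (3, 1), (3, 1), (2, 2)]
print(receptive_field(layers))  # -> 16
```

Varying the depth or kernel sizes in such a stack is one simple way to sweep the receptive field systematically while holding other architectural choices fixed.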