In this tutorial, we introduce how to manage and use data resources, for example through data curation and visualization, with a special focus on research data for polymers and their composites. We cover how data flows through a pipeline that ingests text, tables, and charts from sources such as journal papers and converts them into data points that can be queried and reused in an interactive notebook. As an example, we use MaterialsMine (https://materialsmine.org/nm#/), an open-source resource for materials science data, to illustrate the data flow and to show how data resources can be leveraged for data exploration and visualization in materials science. The tutorial consists of short lectures followed by hands-on activities in which attendees work through prepared data curation and visualization examples.
Upon completion of the tutorial, a learner will be able to:
- Interact with data in MaterialsMine
- Query databases/knowledge graphs
- Extract data for benchmarking and analysis
- Use research data for materials design
- Clean data and stage it for curation
This tutorial will be valuable to researchers in polymer systems who want to learn about data platforms, how to archive their own data, and how to access others' data for benchmarking and further analysis. We will cover general data cleaning, visualization, and analysis tools, and use digital notebooks as a portal for interacting with data resources. Although we focus on one data platform, the methods generalize to other materials data platforms; the tutorial should introduce interested researchers to the fundamentals and expand their fluency with data repositories.
Tutorial Schedule
- Introduction
  Introduction and motivation of Materials Informatics, Findable, Accessible, Interoperable, Reusable (FAIR) data and MaterialsMine
- Ontology and Knowledge Graph
  An overview of the ontology and knowledge graph used to power MaterialsMine
- Materials Data Visualization
  An overview of using visualization tools
- Accessing data in MaterialsMine
  A showcase of manual access to the raw data in MaterialsMine
  A showcase of programmatic access to the raw data in MaterialsMine
- Materials design with reactive notebooks
  Data integration using reactive notebooks
  Case study: Benchmarking Materials Data
- Data curation and cleaning with MaterialsMine
  Overview of data representations and data extraction, transformation, and loading (ETL) processes
  Data ETL process with homogenization in MaterialsMine
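
To give a flavor of the ETL and homogenization topics in the schedule, the following is a minimal sketch in Python using only the standard library. It is not the MaterialsMine pipeline: the sample records, the column names, the `TO_GPA` conversion table, and the `extract`, `transform`, and `load` helpers are all hypothetical, chosen only to illustrate how values reported in mixed units might be brought to a common unit and incomplete records dropped before loading.

```python
import csv
import io

# Hypothetical raw records, as they might look after extraction from papers.
# The values are illustrative only, not real measurements.
RAW_CSV = """sample_id,matrix,filler,tensile_modulus,unit
S1,epoxy,silica,2.9,GPa
S2,epoxy,silica,3100,MPa
S3,PMMA,alumina,,GPa
"""

# Conversion factors to a common unit (GPa); an assumed, minimal table.
TO_GPA = {"GPa": 1.0, "MPa": 1e-3}


def extract(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop incomplete rows and homogenize moduli to GPa."""
    cleaned = []
    for row in rows:
        value = row["tensile_modulus"].strip()
        unit = row["unit"].strip()
        if not value or unit not in TO_GPA:
            continue  # skip missing values or unrecognized units
        cleaned.append({
            "sample_id": row["sample_id"],
            "matrix": row["matrix"],
            "filler": row["filler"],
            "tensile_modulus_gpa": round(float(value) * TO_GPA[unit], 6),
        })
    return cleaned


def load(records):
    """Index cleaned records by sample ID, standing in for a database write."""
    return {record["sample_id"]: record for record in records}


store = load(transform(extract(RAW_CSV)))
for sample_id, record in store.items():
    print(sample_id, record["tensile_modulus_gpa"])
```

In this sketch the record with a missing modulus is discarded and the remaining two are expressed in GPa, so they can be compared directly. A real curation workflow would typically log or flag rejected records rather than silently dropping them.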