Department of Energy Awards Grant to The HDF Group and Collaborators for Fusion Energy Data Management Tools
This project will harness heterogeneous data from multiple sources to make research accessible.
September 1, 2023 – The HDF Group has been selected to receive a Department of Energy grant to develop a platform where data from different fusion devices is managed according to Findable, Accessible, Interoperable, and Reusable (FAIR) standards and UNESCO’s Open Science recommendations. The data will also be adapted for use with machine learning (ML) tools. Led by researchers at MIT, this collaborative project also includes Auburn University, William & Mary, and the University of Wisconsin-Madison.
The HDF Group will provide technical support in data management and infrastructure development in support of specific project tasks, including the development of the FAIR Fusion Ontology and Data Model and performance improvements in MDSplus, the framework for storing and accessing experimental magnetic fusion energy data and its associated metadata. The HDF Group's primary focus will be to develop a data platform that supports machine learning applications using fusion plasma data that originates from different sources and in heterogeneous formats. The data service will abstract a variety of data storage backends behind a single application programming interface (API).
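As a rough illustration of what abstracting several storage backends behind one API can look like, the following minimal Python sketch routes requests through a single service object. It does not represent the project's actual design: the class names (`SignalBackend`, `InMemoryTreeBackend`, `InMemoryFlatBackend`, `DataService`), the method names, the device labels, and the sample values are all hypothetical, and both backends are in-memory stand-ins rather than real MDSplus or HDF5 clients.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SignalBackend(ABC):
    """Hypothetical interface that every storage backend implements."""

    @abstractmethod
    def read_signal(self, shot: int, name: str) -> List[float]:
        """Return the samples of one named signal for one shot."""


class InMemoryTreeBackend(SignalBackend):
    """Stand-in for a tree-structured store keyed by shot, then signal name."""

    def __init__(self, trees: Dict[int, Dict[str, List[float]]]) -> None:
        self._trees = trees

    def read_signal(self, shot: int, name: str) -> List[float]:
        return list(self._trees[shot][name])


class InMemoryFlatBackend(SignalBackend):
    """Stand-in for a flat store keyed by a 'shot/name' path string."""

    def __init__(self, datasets: Dict[str, List[float]]) -> None:
        self._datasets = datasets

    def read_signal(self, shot: int, name: str) -> List[float]:
        return list(self._datasets[f"{shot}/{name}"])


class DataService:
    """Single entry point that routes each request to a registered backend."""

    def __init__(self) -> None:
        self._backends: Dict[str, SignalBackend] = {}

    def register(self, device: str, backend: SignalBackend) -> None:
        self._backends[device] = backend

    def get_signal(self, device: str, shot: int, name: str) -> List[float]:
        if device not in self._backends:
            raise KeyError(f"no backend registered for device {device!r}")
        return self._backends[device].read_signal(shot, name)


# Illustrative values only; these are not real measurements.
service = DataService()
service.register("device_a", InMemoryTreeBackend({101: {"ip": [0.1, 0.2, 0.3]}}))
service.register("device_b", InMemoryFlatBackend({"202/ip": [1.0, 1.1]}))
```

In this sketch, a caller such as an ML training script would use `service.get_signal(device, shot, name)` in the same way for either device, without needing to know how the underlying data is stored.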
The data service will be managed and deployed in an infrastructure-as-code (IaC) manner. Using an IaC process increases reproducibility, interoperability, and portability, encourages sharing, and contributes to the FAIRness of the entire software stack, because the configuration files can be stored in code repositories such as GitHub. Finally, a profile of MDSplus data in HDF5 will be developed, which will propagate FAIRness to data storage and access.
Leading this project for The HDF Group is Dr. Aleksandar Jelenak, senior informatics architect. Jelenak brings decades of experience as a geoscientist, a field where sharing data from many sources has always been an integral part of the discipline. “My role on this project will be helping to develop the software for handling data and designing an effective scientific data management workflow while following the FAIR principles,” said Jelenak. “We have been a part of this effort in the geoscience community, and we are bringing those concepts to fusion plasma research.”
The proposed data platform will leverage existing MDSplus capabilities in the fusion community as well as HDF5's capability for handling voluminous and complex data in high-performance and cloud computing. It will address the challenges faced by both next-generation fusion plasma devices and predictive modeling, all within an Open and FAIR environment.
Paraphrasing one of HDF5’s founders, The HDF Group’s Executive Director, Gerd Heber, comments, “Like source code is primarily aimed at inter-human communication, HDF5 was created for data sharing, and to serve as a lingua franca of data. We are thrilled to work with this distinguished multi-institutional team, and to follow our mission of serving science and society at large.”
About The HDF Group
The HDF Group is a non-profit organization with the mission of advancing state-of-the-art open-source data management technologies, ensuring efficient and equitable long-term access to scientific and engineering data, and supporting our dedicated and diverse user community around the world.
The HDF Group develops and maintains a suite of open-source data management technologies for platforms including mobile devices, the cloud, and supercomputers. This suite includes the globally popular HDF4 and HDF5 libraries and file formats, and the Highly Scalable Data Service, which will be instrumental in strengthening and improving MDSplus capabilities.