HDF5 in the Era of Exascale and Cloud Computing
The HDF Group and friends will be hosting a BOF at Supercomputing 2022 (SC22) on Wednesday, 16 November 2022 in the 12:15 pm – 1:15 pm CST session.
When using HDF5 or HSDS, you’ve likely benefited (even if you weren’t aware of it) from caching features built into the software that can drastically improve performance. HSDS and h5pyd utilize caching to improve performance for service-based applications. In this post, we’ll do a quick review of how HDF5 library caching works and then dive into HSDS and h5pyd caching (with a brief discussion of web caching).
If you are looking to store HDF5 data in the cloud, there are several different technologies that can be used, and choosing between them can be somewhat confusing. In this post, I thought it would be helpful to cover some of the options with the hope of helping HDF users make the best decision for their deployment. Each project will have its own requirements and special considerations, so please take this as just a starting point.
The HDF Group and partners gave a talk “Sharing Experience on HDF5 VOL Connectors Development and Maintenance” on Wednesday, May 11, 1:00-2:00 PM ET as part of the ECP 2022 Community BOF days.
As part of our work with the Exascale Computing Project (ECP), Scot Breitenfeld, Hyo-Kyung Lee, and Larry Knox prepared this status report on the HDF5 VOL. This report provides an overview of the HDF5 VOL connectors created for the ECP.
This presentation was given at the 2022 Exascale Computing Project Annual meeting, as part of the tutorial “Using HDF5 Efficiently on HPC Systems.”
The purpose of this introduction is to highlight and celebrate a community contribution the impact of which we are just beginning to understand. Its principal author, Mr. Lucas C. Villa Real, calls it HDF5-UDF and describes it as “a mechanism to generate HDF5 dataset values on-the-fly using user-defined functions (UDFs).” This matter-of-fact characterization is quite accurate, but I would like to provide some context for what this means for us users of HDF5.
The Open Geospatial Consortium (OGC) membership has approved the Hierarchical Data Format Version 5 (HDF5) Core as an official OGC Standard. HDF5 provides a flexible, extensible, and efficient data model, programming interface, and storage model for keeping and managing spatial data.
David Ziganto is a senior data scientist and corporate trainer at Metis in Chicago, IL. This post was originally shared on his site at https://dziganto.github.io/, but we enjoyed it so much we wanted to share it with everyone.
The topic of software citation has been discussed in many forums recently and several major discovery repositories (e.g. Zenodo and DataCite) support metadata for software in addition to datasets and other resource types. HDF5 straddles the boundary between the dataset and software worlds. It is most commonly thought of and referred to as a data format, but, as in any case, data written in the HDF formats cannot be read without HDF software. So, the answer to the question “is it a format or is it software?” is clearly both.