Webinar Follow-up: Subfiling and Multiple dataset APIs: An introduction to two new features in HDF5 version 1.14

On September 30, Neil Fortner, Jordan Henderson, and Scot Breitenfeld of The HDF Group presented Subfiling and Multiple dataset APIs: An introduction to two new features in HDF5 version 1.14.

For parallel I/O, Subfiling aims for a middle ground between a single shared file and one file per process. It avoids the complexity of managing one file per process while reducing the locking contention that a single shared file causes on a parallel file system. The first part of the talk covered Subfiling’s implementation, its usage, and the performance benefits observed compared to a single shared file.
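To make the "middle ground" concrete, here is a minimal sketch of how a logical file could be spread across a small number of subfiles. It assumes plain round-robin striping, controlled by a stripe size and a stripe count. The type `subfile_loc` and the function `map_offset` are hypothetical names made up for this illustration. They are not part of the HDF5 API, and this is not the library's actual implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Where one byte of the logical file lands under round-robin striping. */
typedef struct {
    uint64_t subfile; /* index of the subfile that holds the byte */
    uint64_t offset;  /* byte offset inside that subfile */
} subfile_loc;

/*
 * Map a logical file offset to a (subfile, offset) pair.
 *   stripe_size  : bytes placed in one subfile before moving to the next
 *   stripe_count : number of subfiles
 * Assumption: simple round-robin striping, for illustration only.
 */
static subfile_loc map_offset(uint64_t logical, uint64_t stripe_size,
                              uint64_t stripe_count)
{
    assert(stripe_size > 0 && stripe_count > 0);

    uint64_t stripe = logical / stripe_size; /* global stripe number */
    subfile_loc loc;

    loc.subfile = stripe % stripe_count;
    /* Full stripes already stored in this subfile, plus position in stripe. */
    loc.offset = (stripe / stripe_count) * stripe_size + logical % stripe_size;
    return loc;
}
```

With a stripe size of 4 bytes and 3 subfiles, logical bytes 0 to 3 go to subfile 0, bytes 4 to 7 to subfile 1, bytes 8 to 11 to subfile 2, and byte 12 wraps around to subfile 0. Processes writing to different regions therefore tend to touch different subfiles, rather than all contending for one shared file.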

The second part of the talk introduced the new HDF5 multiple dataset APIs and highlighted the performance benefits of using them. Before version 1.14, an HDF5 data access operation could touch only one dataset at a time, so accessing multiple datasets required a separate I/O call for each one. The new multiple dataset APIs let users access multiple datasets with a single I/O call. The new routines can also improve performance, especially when all processes access data spread across several datasets.
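The difference in calling pattern can be sketched with a toy model that does not use HDF5 at all. In HDF5 1.14 the real routines are `H5Dread_multi` and `H5Dwrite_multi`, which take a count plus parallel arrays of dataset identifiers, memory types, dataspaces, and buffers. The names below (`toy_dataset`, `toy_write`, `toy_write_multi`, `io_calls`) are hypothetical stand-ins that only mirror that "count plus parallel arrays" shape and count how many calls are issued.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a dataset; not an HDF5 type. */
typedef struct {
    double data[4];
} toy_dataset;

static size_t io_calls = 0; /* number of "I/O calls" issued so far */

/* One call per dataset: the pattern required before the multi APIs. */
static void toy_write(toy_dataset *dset, const double *buf)
{
    io_calls++;
    memcpy(dset->data, buf, sizeof dset->data);
}

/* One call for many datasets: dsets[i] receives the contents of bufs[i]. */
static void toy_write_multi(size_t count, toy_dataset *dsets[],
                            const double *bufs[])
{
    assert(count == 0 || (dsets != NULL && bufs != NULL));

    io_calls++; /* the whole batch counts as a single call */
    for (size_t i = 0; i < count; i++)
        memcpy(dsets[i]->data, bufs[i], sizeof dsets[i]->data);
}
```

Writing three datasets with `toy_write` increments `io_calls` three times, while a single `toy_write_multi(3, dsets, bufs)` increments it once. In the real library, seeing the whole batch in one call is what gives the I/O layer the chance to combine the requests, which is where the performance benefit described above would come from.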

Slide decks:
