Blog

HDF5 Tutorial at the 2020 ECP Annual Meeting

On February 6, 2020, the members of the ECP ExaIO project, Elena Pourmal and Scot Breitenfeld (The HDF Group), Quincey Koziol (NERSC), and Suren Byna (LBNL), presented HDF5 Tutorial at the ECP Annual Meeting.

 

The goal of this Tutorial was to introduce HDF5 to the new users, discuss the best practices when using HDF5, and update the current HDF5 users on the recent achievements and new additions to HDF5 software the ExaIO project contributed.

The Tutorial consisted of three parts. Part I introduced the HDF5 data model, and APIs for organizing data and performing I/O, and the best practices when using HDF5.  Part II gave an overview of parallel file systems  and showed how to use HDF5 in a parallel environment to perform I/O on a shareable file or multiple HDF5 files. Part III used examples from well-known codes and use cases from experimental sciences to demonstrate the tuning techniques such as collective metadata I/O, data aggregation, parallel compression and new HDF5 features that help to utilize HPC storage beyond current files systems (Data Elevator and UnifyFS).

We had about 25 attendees; many good questions were asked during 3 hours of Tutorial. We tried to capture those during the session and we are happy to share our Tutorial (PDF of slide deck) and Q and A materials (below) with the wider community now. Please don’t hesitate to contact help -at- hdfgroup.org if you have any questions about HDF5. We are always happy to help!

Elena Pourmal (epourmal -at- hdfgroup.org)
Scot Breitenfeld  (brtnfld-at- hdfgroup.org)
Quincey Koziol (koziol -at- lbl.gov)
Suren Byna (sbyna -at- lbl.gov)

 

Q&A Session

https://tinyurl.com/uoxkwaq (Google document of live Q&A)

  • Hi, will the slides be available? 
  • Do you support any lossy compression methods?
    • Not currently, although several 3rd-party I/O filters have been written for lossy compression methods:  https://support.hdfgroup.org/services/filters.html   It’s a little tricky how to express lossy compression in HDF5, since the API allows for dataset elements to be overwritten, compounding loss of information when multiple overwrites occur.  Certainly not a blocker, and many applications don’t overwrite data, but something to be aware of.
    • Thanks.  That make sense.
  • Does HDF5 support automatic conversion of language Class types to storage types
  • Will there be any slides discussing debugging of parallel io — how to debug a parallel hang or tracing or ??
    • Yes, Scot will talk about it
  • HDF5 VOL connectors repo
  • Tracing with Darshan
  • How to figure out which HDF5 metadata call is not being called by all ranks:
    • Set the “H5_COLL_API_SANITY_CHECK” environment variable to “1”:  “setenv H5_COLL_API_SANITY_CHECK=1” and the HDF5 library will perform an MPI_Barrier() call inside each metadata operation that modifies the HDF5 namespace.   It will be slow, but much easier to debug and see which rank is hanging in the MPI barrier.
  • Independent I/O for attributes: yay! We write ~3-10 attributes per dataset and group
  • Are there any initiatives to unify the description of bool and complex types in the community?
    • h5py’s conventions are very popular (for labels on enums & compounds) but e.g. pytables and some math libs do not necessarily follow them. I think if you just recommend the h5py way with examples on the homepage, a lot of projects would adopt and unify such trivial yet portability-hindering details for good.

No Comments

Leave a Comment