Blog

The HDF Group hosted a webinar presented by the Hermes team on Friday, August 5, 2022. Hermes is a distributed I/O buffering system for deep distributed storage hierarchies, which are commonly found on modern HPC systems. The webinar highlighted an exciting new component of the library, the Buffer Organizer, which will be released in Hermes 0.8.0-beta. Slide Deck https://www.youtube.com/watch?v=MMIlpx5YqFU ...

If you are looking to store HDF5 data in the cloud, there are several technologies to choose from, and deciding between them can be confusing. In this post, I cover some of the options in the hope of helping HDF users make the best decision for their deployment. Each project will have its own requirements and special considerations, so please take this as just a starting point....

The HDF Group will be hosting a webinar presented by the Hermes team on Friday, August 5, 2022 at 11:00 a.m. Hermes is a distributed I/O buffering system for deep distributed storage hierarchies, which are commonly found on modern HPC systems. This webinar will highlight an exciting new component of the library, the Buffer Organizer, which will be released in Hermes 0.8.0-beta. Register now.  ...

The Highly Scalable Data Service (HSDS) runs as a set of containers in Docker (or pods in Kubernetes), and, like all things Docker, each container instance is created from a container image file. Unlike, say, a library binary, the container image includes all the dependent libraries needed for the container to run. In this blog post, HSDS senior architect John Readey explains how to get HSDS running in a Docker container or Kubernetes pod, and gives some tips and tricks to ensure everything runs smoothly for you. ...

M. Scot Breitenfeld, HDF application support specialist and software engineer at The HDF Group, will present a session, "Introduction to HDF5 for HPC Data Models, Analysis, and Performance," on July 27, 2022. Scot's talk offers a comprehensive overview of HDF5 for anyone who works with big data in an HPC environment. The talk consists of two parts. Part I introduces the HDF5 data model and APIs for organizing data and performing I/O. Part II focuses on advanced HDF5 features such as parallel I/O and gives an overview of parallel HDF5 tuning techniques such as collective metadata I/O, data aggregation, async, parallel compression, and other new HDF5 features that help to utilize HPC storage to its fullest potential. Please register to attend Scot's...

Accessing large data stores over the internet can be rather slow, but you can often speed things up using multiprocessing, i.e., running multiple processes that divvy up the work. Because each process spends much of its time waiting on data, you will often see a good speedup even if you run more processes than your computer has cores....
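
To make the idea concrete, here is a minimal sketch in Python using the standard library's multiprocessing.Pool. It is not taken from the post itself: the helper names split_ranges and fetch_chunk are hypothetical, and the remote read is simulated with time.sleep rather than a real request to a data store. The point is only to show work being divided among more processes than a typical machine has cores, with each process mostly waiting.

```python
import time
from multiprocessing import Pool


def split_ranges(total, n_chunks):
    """Divide range(total) into n_chunks contiguous (start, stop) pairs."""
    base, extra = divmod(total, n_chunks)
    ranges = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional item.
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def fetch_chunk(bounds, delay=0.2):
    """Hypothetical stand-in for a remote read: wait, then return a sum."""
    start, stop = bounds
    time.sleep(delay)  # simulated network wait; no real request is made
    return sum(range(start, stop))


if __name__ == "__main__":
    chunks = split_ranges(1_000, 16)

    # One process: every simulated wait happens back to back.
    t0 = time.perf_counter()
    serial = [fetch_chunk(c) for c in chunks]
    t_serial = time.perf_counter() - t0

    # Sixteen processes: the waits overlap, even on a machine with fewer cores.
    t0 = time.perf_counter()
    with Pool(processes=16) as pool:
        parallel = pool.map(fetch_chunk, chunks)
    t_parallel = time.perf_counter() - t0

    assert serial == parallel
    print(f"serial: {t_serial:.2f}s, 16 processes: {t_parallel:.2f}s")
```

The actual timings depend on the machine and on process start-up cost, so treat the printed numbers as illustrative. With a real data store, the sleep would be replaced by the read itself, and the best process count is something to find by experiment.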