Technical Insights

On February 12, 2021, we were pleased to host Lucas Villa Real of IBM Research to discuss his project HDF5-UDF, a data virtualization tool for HDF5. The tool enables users to associate logic in source code form (i.e., user-defined functions written in Python, C/C++, or Lua) with HDF5 datasets. Such UDFs are compiled into a binary form (which often occupies no more than a few KB) and embedded into HDF5; when an application reads such a dataset, HDF5-UDF executes that binary code and generates the data on the fly. Lucas has just released HDF5-UDF 1.2, which offers several new features: among other benefits, it makes it easy to virtualize CSV files so they look like regular HDF5 datasets. Attached you'll find the slide deck...
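To make the idea of a dataset backed by logic rather than stored values more concrete, here is a minimal conceptual sketch in plain Python. It does not use HDF5-UDF or its API: `LazyDataset`, `csv_column`, and `CSV_TEXT` are hypothetical names invented for illustration, and nothing is compiled or embedded in an HDF5 file. It only shows the pattern of storing a function and producing the values when the dataset is read, here by exposing one column of CSV text as numbers.

```python
import csv
import io
from typing import Callable, List


class LazyDataset:
    """Toy stand-in (hypothetical) for a dataset whose values are computed on read."""

    def __init__(self, name: str, generate: Callable[[], List[float]]):
        self.name = name
        self._generate = generate  # the dataset keeps logic, not data

    def read(self) -> List[float]:
        # Values are produced each time the dataset is read.
        return self._generate()


def csv_column(text: str, column: str) -> Callable[[], List[float]]:
    """Return a function that exposes one CSV column as a list of floats."""

    def generate() -> List[float]:
        reader = csv.DictReader(io.StringIO(text))
        return [float(row[column]) for row in reader]

    return generate


# Hypothetical CSV content standing in for a file on disk.
CSV_TEXT = "time,temperature\n0,20.5\n1,21.0\n2,21.75\n"

temperature = LazyDataset("temperature", csv_column(CSV_TEXT, "temperature"))
values = temperature.read()
print(values)
```

In this sketch the reader only calls `read()` and never sees the CSV parsing, which mirrors the description above: the application reads what looks like a regular dataset while the values are generated at access time.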

The HDF Group’s technical mission is to provide rapid, easy and permanent access to complex data. FishEye's vision is "Synthesizing the world’s real-time data". This white paper is intended for embedded system users, software engineers, integrators, and testers who use or want to use HDF5 to access, collect, use, and analyze machine data. FishEye has developed an innovative process that provides the most efficient method to expose data from embedded systems, simplifying and liberating that data for real-time analysis, machine learning, and cloud-enabled services....

HSDS (Highly Scalable Data Service) is a REST-based service for reading and writing HDF data. HSDS was initially developed as a NASA Access 2015 project, and The HDF Group has continued to invest in it. As we'll see, the latest version has a bevy of new and interesting features....

We are pleased to post this white paper from The HDF Group intern, Chen Wang. This paper looks at the steps of analyzing and tuning the HACC-IO benchmarks and at the impact of different access patterns, stripe settings, and HDF5 metadata. It also compares the five benchmarks on two different parallel file systems, Lustre and GPFS, and shows that HDF5 with proper optimizations can catch up with the pure MPI-IO implementations. An I/O Study of ECP Applications...

On February 6, 2020, the members of the ECP ExaIO project, Elena Pourmal and Scot Breitenfeld (The HDF Group), Quincey Koziol (NERSC), and Suren Byna (LBNL), presented an HDF5 Tutorial at the ECP Annual Meeting. We've posted the slide deck and Q&A for your convenience....

The HDF5 European workshop, co-organized with ESRF and sponsored by OpenIO and Omnibond, took place on September 17-18, 2019. This event covered the latest HDF5 developments, HDF5 use cases from science and industry, and HDF5 Applications and Tools. This post is an archive of the recorded presentations of this event. ...