Research

The HDF Group’s technical mission is to provide rapid, easy and permanent access to complex data. FishEye's vision is "Synthesizing the world’s real-time data". This white paper is intended for embedded system users, software engineers, integrators, and testers who use or want to use HDF5 to access, collect, use and analyze machine data. FishEye has developed an innovative process that provides the most efficient method of exposing data from embedded systems, simplifying and liberating that data for real-time analysis, machine learning, and cloud-enabled services....

We are pleased to post this white paper from The HDF Group intern, Chen Wang. This paper looks at the steps of analyzing and tuning the HACC-IO benchmarks, and at the impact of different access patterns, stripe settings and HDF5 metadata. It also compares the five benchmarks on two different parallel file systems, Lustre and GPFS, and shows that HDF5 with proper optimizations can catch up with the pure MPI-IO implementations. An I/O Study of ECP Applications...

On February 6, 2020, the members of the ECP ExaIO project, Elena Pourmal and Scot Breitenfeld (The HDF Group), Quincey Koziol (NERSC), and Suren Byna (LBNL), presented an HDF5 tutorial at the ECP Annual Meeting. We've posted the slide deck and Q&A for your convenience....

Why do we use HDF5? We moved to HDF5 for our simulation data in 2016, replacing our own proprietary file format. HDF5 had been on our radar for some time, and we spent a couple of years investigating it and other file formats before deciding which one to switch to. HDF5 met all the criteria we had at the time. Among the criteria were: performance in speed and size, being an accepted standard for scientific data, being open source, providing additional tools....

With MyIRE and HDF, everyone from a doctor in a small-town office to big data genetics researchers can work together to find new and powerful insights across data sets using a common set of tools - and do so in a repeatable way. And, because of HDF5, any user—whether large or small—is powered by the same technology used by CERN and NASA. We knew we wanted all of MyIRE’s users to have the power of NASA in their pocket. HDF5 made that possible....

The topic of software citation has been discussed in many forums recently, and several major discovery repositories (e.g. Zenodo and DataCite) support metadata for software in addition to datasets and other resource types. HDF5 straddles the boundary between the dataset and software worlds. It is most commonly thought of and referred to as a data format, but, in any case, data written in the HDF formats cannot be read without HDF software. So the answer to the question "is it a format or is it software?" is clearly both....

Tobias Weinzierl, Durham University, UK; Sven Köppel, FIAS, Germany; Michael Bader, TUM, Germany; HDF Guest Bloggers. ExaHyPE develops a solver engine for hyperbolic differential equations solved on adaptive Cartesian meshes. It supports various HDF5 output formats. Exascale computing is expected to allow scientists and engineers to simulate, and ultimately understand, wave phenomena with unprecedented accuracy over unprecedented time spans. To harvest the power of exascale machines, however, well-suited software has to become available. ExaHyPE is an H2020 project writing a PDE solver engine (similar to a 3D computer game engine) that will allow groups with decent CSE expertise to write their own solver for hyperbolic equation systems within a year. The resulting solver will scale to exascale. This is made possible by the unique...