Research

The HDF Group has been selected to receive a Department of Energy grant to develop a platform where data from different fusion devices is managed according to Findable, Accessible, Interoperable, and Reusable (FAIR) standards and UNESCO’s Open Science recommendations. The data will also be adapted for use with machine learning (ML) tools. Led by researchers at MIT, this collaborative project also includes Auburn University, William & Mary, and the University of Wisconsin-Madison....

The HDF Group's formal comment to a DOE request on stewardship for scientific and high-performance computing was published in the Federal Register. We had a joint position paper at January's ASCR Workshop on Visualization for Scientific Discovery, Decision-Making, & Communication. The paper is called "Whither Visualization Logic" and was written by Leigh Orf (University of Wisconsin), Lucas Villa Real (IBM Research), and Gerd Heber (The HDF Group). Thomas Caswell (active h5py contributor, Brookhaven National Laboratory) also has a position paper called "Visualization of Structured Data". Additionally, we have two position papers at the ASCR Workshop on the Management and Storage of Scientific Data. The first, "The Twilight of I/O as a User Concept", was led by Jerome Soumagne (The HDF Group) and is joint work with Andres Marquez from PNNL....

The group at BioSimulations.org has been doing some very interesting work using HSDS on Kubernetes to store biomodelling data and visualizing the results using Vega, as described in the paper below. BioSimulations chose HSDS for its support for very large data sets, its REST API (for use with web applications), and its ability to run on Google Cloud as well as in on-premise installations. ...

The HDF Group’s technical mission is to provide rapid, easy and permanent access to complex data. FishEye's vision is "Synthesizing the world’s real-time data". This white paper is intended for embedded system users, software engineers, integrators, and testers who use or want to use HDF5 to access, collect, use and analyze machine data. FishEye has developed an innovative process for efficiently exposing data from embedded systems, one that simplifies and liberates the data for real-time analysis, machine learning, and cloud-enabled services....

We are pleased to post this white paper from HDF Group intern Chen Wang. The paper walks through the steps of analyzing and tuning the HACC-IO benchmarks and examines the impact of different access patterns, stripe settings, and HDF5 metadata. It also compares the five benchmarks on two different parallel file systems, Lustre and GPFS, and shows that HDF5 with proper optimizations can catch up with the pure MPI-IO implementations. An I/O Study of ECP Applications...

On February 6, 2020, members of the ECP ExaIO project, Elena Pourmal and Scot Breitenfeld (The HDF Group), Quincey Koziol (NERSC), and Suren Byna (LBNL), presented an HDF5 tutorial at the ECP Annual Meeting. We've posted the slide deck and Q&A for your convenience....