The HDF Community

Suren Byna, Elena Pourmal, Lori Cooper. Overview: HDF5 is a widely used tool that simplifies the management of, and access to, scientific and engineering data with ubiquitous data solutions. With data growing rapidly across all domains of science and industry, HDF5 developers have been building technologies that provide rapid, easy, and permanent access to complex data. The HDF Group, a non-profit organization, has been the driving force behind developing and maintaining the HDF5 software library for more than two decades. HDF5 has been extremely successful, with a wide range of users and an ecosystem built around the HDF5 library. With the goal of facilitating a broader discussion among HDF5 users, The HDF Group and Lawrence Berkeley National Laboratory (LBNL) teamed up to host a...

On March 26, 2021, at 11:00 CDT, we will present the webinar Hermes - A Distributed Buffering System for Heterogeneous Storage Hierarchies. Abstract: In this webinar, we will provide an update on an NSF-funded joint effort between the Illinois Institute of Technology (IIT) and The HDF Group. We will explain Hermes' goals and what differentiates it from existing technologies. We will introduce our team members and present Hermes' current status, including the abstractions involved, the high-level API, and the system architecture. Different team members will present the results they've obtained under this project. Finally, we've prepared a few demonstrations, and we will show how you can get involved. Register...

On January 19, 2021, The HDF Group participated in an NCSA webinar series, The University of Illinois New Frontiers Initiative Webinar Series, with sessions presented by Gerd Heber, John Readey, and Aleksander Jelenak. HDF Technologies and Resources for Geospatial Data. Abstract: The HDF Group created HDF at NCSA in 1988 to enable scientists and engineers to describe, store, and access large, complex data structures and collections. Since then we have worked with data producers, providers, and users all over the world and in every discipline to develop and evolve HDF to meet the needs of changing technologies and applications. Applications as diverse as gravity wave detection, gene sequencing, and finance use the HDF5 data model and software to acquire and share data and solve...

On February 12, 2021, we were pleased to host Lucas Villa Real of IBM Research to discuss his project HDF5-UDF, a data virtualization tool for HDF5. The tool enables users to associate logic in source code form (that is, user-defined functions, or UDFs, written in Python, C/C++, or Lua) with HDF5 datasets. Such UDFs are compiled into a binary form (which often takes no more than a few KB) and embedded into HDF5; when an application reads such a dataset, HDF5-UDF executes that binary code and generates the data on the fly. Lucas has just released HDF5-UDF 1.2, which offers several new features: among other benefits, it makes it possible to easily virtualize CSV files so they look like regular HDF5 datasets. Attached you'll find the slide deck...
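
To make the idea of a dataset whose values are generated at read time more concrete, here is a minimal conceptual sketch in plain Python. It does not use the HDF5-UDF API or HDF5 at all: the `VirtualDataset` class, the `temperature_udf` function, and the inline CSV text are all hypothetical names invented for illustration. It only mimics the pattern described above, where a function attached to a dataset produces the values when the dataset is read, here by parsing CSV content.

```python
import csv
import io


class VirtualDataset:
    """Hypothetical stand-in for a dataset backed by a user-defined function."""

    def __init__(self, udf):
        # udf is a callable that produces the values when invoked.
        self._udf = udf

    def read(self):
        # No values are stored; they are generated each time the dataset is read.
        return self._udf()


# Hypothetical CSV content standing in for an external CSV file.
CSV_TEXT = "time,temperature\n0,20.5\n1,21.0\n2,21.7\n"


def temperature_udf():
    """Parse the CSV text and return the 'temperature' column as floats."""
    reader = csv.DictReader(io.StringIO(CSV_TEXT))
    return [float(row["temperature"]) for row in reader]


temperature = VirtualDataset(temperature_udf)
print(temperature.read())  # prints [20.5, 21.0, 21.7]
```

In the real tool, as the post describes, the function is compiled and embedded in the HDF5 file, so applications read the dataset through HDF5 as usual; the sketch only shows the read-triggers-computation pattern.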

The HDF Group’s Gerd Heber hosts a weekly session where he answers attendee questions and, for example, goes over the previous week’s HDF Forum posts. The HDF Clinics are free sessions intended to help users tackle real-world HDF problems, from a common cold to severe headaches, and offer relief where that’s possible. As time permits, we will include how-tos, offer advice on tool usage, review your code samples, teach you survival in the documentation jungle, and discuss what’s new or just around the corner in the land of HDF. Please submit questions/topics in this Google Doc. One-time registration required. Thanks to all who attended our first HDF Clinic on February 9, 2021. The resources from this clinic are archived here. Gerd's Notes https://youtu.be/g5h_YlvI9Aw  ...

The HDF Group’s technical mission is to provide rapid, easy, and permanent access to complex data. FishEye's vision is "Synthesizing the world’s real-time data". This white paper is intended for embedded system users, software engineers, integrators, and testers who use or want to use HDF5 to access, collect, use, and analyze machine data. FishEye has developed an innovative process for exposing data from embedded systems efficiently, one that simplifies and liberates that data for real-time analysis, machine learning, and cloud-enabled services....

Thank you to our co-participants at OpenIO for allowing us to work with you to create this webinar on object storage. If you're interested in the webinar recording or slide deck, you can access them here: https://www.openio.io/blog/webinar-switch-to-object-storage....

Many organizations have petabytes of HDF5 data stored on on-premises NAS systems, and while object storage systems are generally more cost-effective than NAS, applications written for POSIX storage won’t “just work” with object storage. The HDF Group has taken the pain and expense out of this problem by developing open source libraries and services that enable HDF5 applications to transparently use object storage (on-prem or in the cloud), with no modifications needed. Join us in this webinar to learn more....

John Readey, senior architect at The HDF Group, recently worked with OpenIO to integrate the Highly Scalable Data Service (HSDS) with OpenIO’s storage. (You can read more about this integration here.) John will present along with Guillaume Delaporte, Co-founder and VP Pre-Sales at OpenIO, to discuss the technical and business implications of moving to object storage....