The HDF Community

The agenda for the 2021 HDF5 User Group meeting has been posted. This event, scheduled with US time zones in mind, runs each day October 12-14 from about 9:00 a.m. to 1:30 p.m. Central time and features a variety of speakers on topics including the HDF5 ecosystem, apps, and features. This event is free and online only; registration is required. We look forward to having you join us!...

On Friday, May 14, 2021, we hosted JP Swinski as he presented the webinar, H5Coro: The HDF5 Cloud-Optimized Read-Only Library. NASA’s migration of science data products and services to AWS has sparked a debate on the best way to access science data stored in the cloud. Given that a large portion of NASA’s science data is in the HDF5 format or one of its derivatives, a growing number of efforts are looking at ways to efficiently access H5 files residing in S3. This presentation describes one of those efforts and argues for the creation of a standardized subset of the HDF5 specification targeting cloud environments. Slide deck https://youtu.be/RuBKDW4TNO4...

May 28, 2021, 11:00 a.m. CDT. In this webinar, we continue our exploration of Hermes. We will pick up where we left off in the first webinar, in which we showcased the Hermes API basics and adapters and showed how applications could benefit from using Hermes. We will run the popular VPIC kernel with Hermes and highlight the performance improvements due to the hierarchical buffering system. We will also present an HDF5 Virtual File Driver (VFD) for Hermes. All demonstrations will be accompanied by detailed performance analysis, and there will be time for audience questions and answers.
Resources:
- Previous Webinar Recording: https://www.youtube.com/watch?v=KJdXMqfRmS4
- Hermes on GitHub: https://github.com/HDFGroup/hermes
Register now!...

H5Coro: The HDF5 Cloud-Optimized Read-Only Library. Presented by JP Swinski, Friday, May 14, 2021, 11:00 a.m. CDT. Abstract: NASA’s migration of science data products and services to AWS has sparked a debate on the best way to access science data stored in the cloud. Given that a large portion of NASA’s science data is in the HDF5 format or one of its derivatives, a growing number of efforts are looking at ways to efficiently access H5 files residing in S3. This presentation describes one of those efforts and argues for the creation of a standardized subset of the HDF5 specification targeting cloud environments. Please register to join us....

On March 26, 2021, The HDF Group hosted the Hermes development team to learn more about the Hermes project. Abstract: In this webinar, we will provide an update on an NSF-funded joint effort between the Illinois Institute of Technology (IIT) and The HDF Group. We will explain Hermes’ goals and what differentiates it from existing technologies. We will introduce our team members and present Hermes’ current status, including the abstractions involved, the high-level API, and its system architecture. Different team members will present the results they’ve obtained under this project. Finally, we’ve prepared a few demonstrations, and we will show how you can get involved. Slide Deck https://youtu.be/KJdXMqfRmS4...

Suren Byna, Elena Pourmal, Lori Cooper. Overview: HDF5 has been a widely used tool to simplify management and access to scientific and engineering data with ubiquitous data solutions. With rapidly growing data across all domains of science and industry, HDF5 developers have been building technologies that provide rapid, easy, and permanent access to complex data. The HDF Group, a non-profit organization, has been the driving force behind developing and maintaining the HDF5 software library for more than two decades. HDF5 has been extremely successful with a wide range of users and an ecosystem built around the HDF5 library. With the goal of facilitating a broader discussion among HDF5 users, The HDF Group and Lawrence Berkeley National Laboratory (LBNL) teamed up to host a...

On March 26, 2021, at 11:00 CDT, we will present the webinar Hermes - A Distributed Buffering System for Heterogeneous Storage Hierarchies. Abstract: In this webinar, we will provide an update on an NSF-funded joint effort between the Illinois Institute of Technology (IIT) and The HDF Group. We will explain Hermes' goals and what differentiates it from existing technologies. We will introduce our team members and present Hermes' current status, including the abstractions involved, the high-level API, and its system architecture. Different team members will present the results they've obtained under this project. Finally, we've prepared a few demonstrations, and we will show how you can get involved. Register...

On January 19, 2021, The HDF Group participated in NCSA’s webinar series, the University of Illinois New Frontiers Initiative Webinar Series, with sessions presented by Gerd Heber, John Readey, and Aleksander Jelenak. HDF Technologies and Resources for Geospatial Data. Abstract: The HDF Group created HDF at NCSA in 1988 to enable scientists and engineers to describe, store, and access large, complex data structures and collections. Since then we have worked with data producers, providers, and users all over the world and in every discipline to develop and evolve HDF to meet the needs of changing technologies and applications. Applications as diverse as gravity wave detection, gene sequencing, and finance use the HDF5 data model and software to acquire and share data and solve...

On February 12, 2021, we were pleased to host Lucas Villa Real of IBM Research to discuss his project HDF5-UDF, a data virtualization tool for HDF5. The tool enables users to associate logic in source code form (i.e., in user-defined functions written in Python, C/C++, or Lua) with HDF5 datasets. Such UDFs are compiled into a binary form (which often takes no more than a few KB) and embedded into HDF5; once an application reads such a dataset, HDF5-UDF executes that binary code and generates the data on the fly. Lucas has just released HDF5-UDF 1.2, which offers several new features: among other benefits, it makes it possible to easily virtualize CSV files so they look like regular HDF5 datasets. Attached you'll find the slide deck...