Research

With MyIRE and HDF, everyone from a doctor in a small-town office to big data genetics researchers can work together to find new and powerful insights across data sets using a common set of tools, and do so in a repeatable way. And, because of HDF5, any user, whether large or small, is powered by the same technology used by CERN and NASA. We knew we wanted all of MyIRE’s users to have the power of NASA in their pocket. HDF5 made that possible....

The topic of software citation has been discussed in many forums recently, and several major discovery repositories (e.g., Zenodo and DataCite) support metadata for software in addition to datasets and other resource types. HDF5 straddles the boundary between the dataset and software worlds. It is most commonly thought of and referred to as a data format but, in any case, data written in the HDF formats cannot be read without HDF software. So the answer to the question "Is it a format or is it software?" is clearly both....

Tobias Weinzierl, Durham University, UK; Sven Köppel, FIAS, Germany; Michael Bader, TUM, Germany; HDF Guest Bloggers

ExaHyPE develops a solver engine for hyperbolic differential equations solved on adaptive Cartesian meshes. It supports various HDF5 output formats. Exascale computing is expected to allow scientists and engineers to simulate, and ultimately understand, wave phenomena with unprecedented accuracy over unprecedented time spans. To harvest the power of exascale machines, however, well-suited software has to become available. ExaHyPE is an H2020 project writing a PDE solver engine (similar to a 3D computer game engine) that will allow groups with decent CSE expertise to write their own solver for hyperbolic equation systems within a year. The resulting solver will scale to exascale. This is made possible by the unique...

Scot Martin, Harvard University, HDF Guest Blogger

HDF5 storage is really interesting. To me, its format has no fixed structure, but instead is based on introspection and discovery. Seems great to me; Mathematica has its origins first in artificial intelligence, so we ought to be able to do something here. Approaching twenty-two years with Mathematica and almost a “Hello, World!” ability in C, I decided to jump right in. Enter The HDF Group's P/Invoke for my salvation. Here’s how we make use of it in Mathematica:

    LoadNETAssembly["HDF.PInvoke.dll"]

Bang! Ready to go in Mathematica. Here’s a proof of concept for how it works:

    Module[
      (* The three symbols should have initial values so that there is *)
      (* memory allocation when Mathematica interfaces with P/Invoke. *)
      {major=0,minor=0,revision=0,return},
      CompoundExpression[ (* access...

Christian Hoene, Symonics GmbH; and Piotr Majdak, Acoustics Research Institute; HDF Guest Bloggers

Spatial audio - 3D sound. Back in the ’70s, “dummy head” microphones were used to create spatial audio recordings. With headphones, one was able to listen to those recordings and marvel at the impressive spatial distribution of sounds – just like in real life.

[Figure: Displays the difference between listening to a real source and listening to realistic virtual sounds via headphones]

Nowadays, we have a much better understanding of human binaural perception and we can even simulate spatial audio signals with the help of computers. Indeed, a modern virtual reality (VR) headset such as the Oculus Rift or Samsung Gear utilizes 3D audio to allow...

Mark Miller, Lawrence Livermore National Laboratory, Guest Blogger

The HDF5 library has supported the I/O requirements of HPC codes at Lawrence Livermore National Laboratory (LLNL) since the late ’90s. In particular, HDF5 used in the Multiple Independent File (MIF) parallel I/O paradigm has supported LLNL codes’ scalable I/O requirements and has recently been gainfully used at scales as large as 1,000,000 parallel tasks.

What is the MIF Parallel I/O Paradigm?

In the MIF paradigm, a computational object (an array, a mesh, etc.) is decomposed into pieces and distributed, perhaps unevenly, over parallel tasks. For I/O, the tasks are organized into groups and each group writes one file using round-robin exclusive access for the tasks in the group. Writes within groups are serialized but...
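The grouping idea in the MIF excerpt above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not LLNL's implementation: the function names `mif_groups` and `write_schedule` are hypothetical, the contiguous rank-to-group mapping is one possible choice among many, and no MPI or HDF5 calls are made. It simply shows which task would hold exclusive access to which file on each turn.

```python
def mif_groups(n_tasks, n_files):
    """Split ranks 0..n_tasks-1 into n_files contiguous groups.

    Assumption for illustration: groups are contiguous blocks of ranks
    whose sizes differ by at most one. Each group shares one file.
    """
    if not 1 <= n_files <= n_tasks:
        raise ValueError("need 1 <= n_files <= n_tasks")
    base, extra = divmod(n_tasks, n_files)
    groups, start = [], 0
    for g in range(n_files):
        size = base + (1 if g < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def write_schedule(groups):
    """Return a list of turns; each turn lists (rank, file_index) pairs.

    Within a group, tasks take turns (one writer per file at a time),
    while different groups write to their own files during the same turn.
    """
    turns = []
    for turn in range(max(len(g) for g in groups)):
        turns.append([(g[turn], file_index)
                      for file_index, g in enumerate(groups)
                      if turn < len(g)])
    return turns


if __name__ == "__main__":
    # 8 tasks sharing 3 files: at most 3 tasks write at any moment.
    for turn, writers in enumerate(write_schedule(mif_groups(8, 3))):
        print(f"turn {turn}: {writers}")
```

In this sketch the number of files sets the amount of concurrency: with 8 tasks and 3 files, at most 3 tasks write at the same time, one per file, and every other task waits for its turn in its own group.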

DOE has continued to partner with The HDF Group, supporting development of HDF5 through two generations of computing; sponsoring this development has benefited the entire HDF5 user community. Today, DOE supports current HDF5 R&D to ensure that the data challenges of third generation exascale computing ...

MuQun (Kent) Yang, The HDF Group

Many NASA HDF and HDF5 data products can be visualized via the Hyrax OPeNDAP server through Hyrax’s HDF4 and HDF5 handlers.  Now we’ve enhanced the HDF5 OPeNDAP handler so that SMAP level 1, level 3 and level 4 products can be displayed properly using popular visualization tools.

Organizations in both the public and private sectors use HDF to meet long-term, mission-critical data management needs. For example, NASA’s Earth Observing System, the primary data repository for understanding global climate change, uses HDF. Over the lifetime of the project, which began in 1999, NASA has stored 15 petabytes of satellite data in HDF, which will be accessible by NASA data centers and NASA HDF end users for many years to come.

In a previous blog post, we discussed the concept of using the Hyrax OPeNDAP web server to serve NASA HDF4 and HDF5 products. Each year, The HDF Group has enhanced the HDF4 and HDF5 handlers that work within the Hyrax OPeNDAP framework to support a wide range of NASA HDF data products, making them interoperable with popular Earth Science tools such as NASA’s Panoply and UCAR’s IDV. The Hyrax HDF4 and HDF5 handlers make data products display properly in these visualization tools.