Blog

We are currently planning for a Q2 2016 release of the product. In the meantime, we are working with a few early adopters on finalizing the initial feature set. If you have additional questions about HDF5/ODBC, or if you would like to become an early adopter, please contact us ...

John Readey, The HDF Group

We’re pleased to announce that The HDF Group is now a member of the Open Commons Consortium (formerly Open Cloud Consortium), a not-for-profit organization that manages and operates cloud computing and data commons infrastructure to support scientific, medical, health care and environmental research.

The HDF Group will participate in the NOAA Data Alliance Working Group (WG), serving on the WG committee that will determine which datasets are hosted in the NOAA data commons and which tools are used in the computational ecosystem surrounding it.


“The Open Commons Consortium (OCC) is a truly innovative concept for supporting scientific computing,” said Mike Folk, The HDF Group’s President. “Their cloud computing and data commons infrastructure supports a wide range of research, and OCC’s membership spans government, academia, and the private sector.  This is a good opportunity for us to learn about how we can best serve these communities.”

The HDF Group will also participate in the Open Science Data Cloud working group and receive resource allocations on the OSDC Griffin resource.  The HDF Group’s John Readey is working with the OCC and others to investigate ways to use Griffin effectively.  Readey says, “Griffin is a great testbed for cloud-based systems.  With access to object storage (using the AWS/S3 api) and the ability to programmatically create VM’s, we will explore new methods for the analysis of scientific datasets.” 

Joel Plutchak, The HDF Group

The HDF Group’s support for and use of the Java Programming Language consists of Java wrappers for the HDF4 and HDF5 C libraries, an Object Model definition and implementation, and HDFView, a graphical file viewing application. In this article we’ll discuss what we’re doing now with Java, and look toward the future.

The screen capture shows some of the capabilities of the HDFView application. Displayed is a JPSS Mission VIIRS (Visible Infrared Imaging Radiometer Suite) Day-Night band dataset in table form and image form with false color palette attached.

By the time the first public version of the Java Programming Language was released in 1995, various groups at the University of Illinois were already...

Anthony Scopatz, Assistant Professor at the University of South Carolina, HDF guest blogger

“Python is great and its ecosystem for scientific computing is world class. HDF5 is amazing and is rightly the gold standard for persistence for scientific data. Many people use HDF5 from Python, and this number is only growing due to pandas’ HDFStore. However, using HDF5 from Python has at least one more knot than it needs to.  Let’s change that.”

Almost immediately when going to use HDF5 from Python you are faced with a choice between two fantastic packages with overlapping capabilities: h5py and PyTables.  h5py wraps the HDF5 API more closely using autogenerated Cython.  PyTables, while also wrapping HDF5, focuses more on a Table data structure and adds...

Lindsay Powers, The HDF Group

The 2015 HDF workshop held during the ESIP Summer Meeting was a great success thanks to more than 40 participants throughout the four sessions.  The workshop was an excellent opportunity for us to interact with HDF community members to better understand their needs and introduce them to new technologies. You can view the slide presentations from the workshop here.

From my perspective, the highlight of the workshop was the Vendors and Tools Session, where Ellen Johnson (Mathworks), Christine White (Esri), Brian Tisdale (NASA), and Gerd Heber (The HDF Group) talked about new and improved applications of HDF technologies.  For example:

Quincey Koziol, The HDF Group

“A supercomputer is a device for turning compute-bound problems into I/O-bound problems.” – Ken Batcher, Prof. Emeritus, Kent State University.

HDF5 began out of a collaboration between the National Center for Supercomputing Applications (NCSA) and the US Department of Energy’s Advanced Simulation and Computing Program (ASC), so high-performance computing (HPC) I/O has been in our focus from the very beginning.  As we are starting our 20th year of development on HDF5, HPC I/O continues to be a critical driver of new features.

Los Alamos National Laboratory is home to two of the world’s most powerful supercomputers, each capable of performing more than 1,000 trillion operations per second. Here, ASC is examining the effects of a one-megaton nuclear energy source detonated on the surface of an asteroid. Image from ASC at http://www.lanl.gov/asci/

The HDF5 development team has focused on three things when serving the HPC community: performance, freedom of choice and ease of use.

David Dotson, doctoral student, Center for Biological Physics, Arizona State University; HDF Guest Blogger

Recently I had the pleasure of meeting Anthony Scopatz for the first time at SciPy 2015, and we talked shop. I was interested in his opinions on MDSynthesis, a Python package our lab has designed to help manage the complexity of raw and derived data sets from molecular dynamics simulations, about which I was

Mohamad Chaarawi, The HDF Group

Second in a series: Parallel HDF5

NERSC’s Cray Sonexion system provides data storage for its Mendel scientific computing cluster.

In my previous blog post, I discussed the need for parallel I/O and a few paradigms for doing parallel I/O from applications. HDF5 is an I/O middleware library that supports (or will support in the near future) most of the I/O paradigms we talked about.

In this blog post I will discuss how to use HDF5 to implement some of the parallel I/O methods and some of the ongoing research to support new I/O paradigms. I will not discuss pros and cons of each method since we discussed those in the previous blog post.

But before getting on with how HDF5 supports parallel I/O, let’s address a question that comes up often, which is,

“Why do I need Parallel HDF5 when the MPI standard already provides an interface for doing I/O?”
