Innovation

On March 30th, we announced the release of HDF5 version 1.10.2. With this release, we accomplished all the tasks planned for the major HDF5 1.10 series. It is time for applications to begin (or continue) their migration from HDF5 1.8 to the new major release, as we will be dropping support for HDF5 1.8 in the summer of 2019. In this blog post, we will focus only on the major new features and bug fixes in HDF5 1.10.2. Hopefully, after reading about them, you will be convinced it is time to upgrade to HDF5 1.10.2....

50 TB of Wind Integration National Dataset (WIND) toolkit data is now available to anyone via HDF Cloud, thanks to a collaboration between John Readey, Sr. Architect at The HDF Group, and NREL (the National Renewable Energy Laboratory). Access the data now with a Jupyter Notebook or through the interactive web-based visualization tool. If you want to learn more, read the Amazon blog post where John and his NREL collaborators discuss the technology and motivation behind this project....
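For readers who want to try this from Python, here is a minimal sketch using h5pyd, the h5py-compatible client for HDF Cloud. The domain path, endpoint, and dataset name below are illustrative assumptions; check the NREL/HDF Cloud documentation for the actual values:

```python
# Minimal sketch: reading WIND toolkit data from HDF Cloud with h5pyd.
# h5pyd mirrors the h5py API but talks to a remote HDF service over HTTP.
# NOTE: the domain path, endpoint, and dataset name are assumptions for
# illustration; consult the NREL/HDF Cloud docs for the real values.
import h5pyd

with h5pyd.File("/nrel/wtk-us.h5", "r",
                endpoint="https://developer.nrel.gov/api/hsds") as f:
    wind_speed = f["windspeed_100m"]   # hypothetical dataset name
    print(wind_speed.shape, wind_speed.dtype)
    # Only the requested hyperslab is transferred over the network,
    # so you never download the full 50 TB collection.
    subset = wind_speed[0, 100:110, 200:210]
    print(subset)
```

Because h5pyd follows the h5py API, existing analysis scripts can often be pointed at the cloud-hosted data with little more than an import change.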

By Francesc Alted. Francesc is a freelance consultant, the author of several open source libraries including PyTables, Blosc, bcolz, and numexpr, and an experienced Python and C programmer. He collaborates regularly with The HDF Group on different projects. In this post, we explain our solution for handling big data streams using HDF5 (with a little help from other tools). In the ubiquitously connected world we live in, there are good reasons to understand what data is transferred across a network and how to extract information from it. Being able to capture and log network packets is useful for many tasks, including protecting against cyber threats, enforcing policy, extracting and consolidating valuable information, debugging protocols/services, and understanding how users use...
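The full pipeline is in the post itself, but as a rough illustration of the idea, here is a minimal sketch that logs packet metadata to an HDF5 table with PyTables. The record layout and the stand-in packet source are assumptions for illustration, not the post's actual schema:

```python
# Minimal sketch: logging network-packet metadata to an HDF5 table with
# PyTables. The record layout and the fake packet source are illustrative
# assumptions; a real capture loop would pull packets from e.g. libpcap.
import time
import tables

class Packet(tables.IsDescription):
    timestamp = tables.Float64Col()    # capture time (epoch seconds)
    src_ip    = tables.StringCol(15)   # dotted-quad IPv4 address
    dst_ip    = tables.StringCol(15)
    length    = tables.UInt32Col()     # packet size in bytes

with tables.open_file("packets.h5", mode="w") as h5:
    table = h5.create_table("/", "packets", Packet, "captured packet log")
    row = table.row
    # Stand-in for a real capture loop.
    for pkt in [("10.0.0.1", "10.0.0.2", 1500), ("10.0.0.2", "10.0.0.1", 64)]:
        row["timestamp"] = time.time()
        row["src_ip"], row["dst_ip"], row["length"] = pkt
        row.append()
    table.flush()  # push buffered rows to disk
```

Appending rows and flushing in batches like this keeps the write path cheap enough to keep up with a live packet stream.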

The HDF Server allows producers of complex datasets to share their results with a wide audience. We used it to develop the Global Fire Emissions Database (GFED) Analysis Tool, a website that guides the user through our dataset. A simple web-map interface lets users select an area of interest and produce data visualization charts. ...

Mark Miller, Lawrence Livermore National Laboratory, Guest Blogger The HDF5 library has supported the I/O requirements of HPC codes at Lawrence Livermore National Laboratory (LLNL) since the late 1990s. In particular, HDF5 used in the Multiple Independent File (MIF) parallel I/O paradigm has supported LLNL codes' scalable I/O requirements and has recently been used successfully at scales as large as 1,000,000 parallel tasks. What is the MIF Parallel I/O Paradigm? In the MIF paradigm, a computational object (an array, a mesh, etc.) is decomposed into pieces and distributed, perhaps unevenly, over parallel tasks. For I/O, the tasks are organized into groups, and each group writes one file using round-robin exclusive access for the tasks in the group. Writes within groups are serialized but...
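To make the pattern concrete, here is a minimal sketch of MIF-style round-robin group I/O using mpi4py and h5py. The group size, file naming, and payload are assumptions for illustration, not LLNL's actual implementation:

```python
# Minimal sketch of the MIF (Multiple Independent File) I/O pattern with
# mpi4py and h5py. Group size, file naming, and the payload are illustrative
# assumptions; production codes implement this far more robustly.
import h5py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, nprocs = comm.Get_rank(), comm.Get_size()

GROUP_SIZE = 4                  # tasks per file (assumption)
group = rank // GROUP_SIZE      # which file this task writes to
first = group * GROUP_SIZE      # first rank in this group
fname = f"mif_part_{group:04d}.h5"

# Round-robin baton: wait until the previous task in the group is done.
if rank != first:
    comm.recv(source=rank - 1, tag=0)

# Exclusive access: only one task in the group touches the file at a time,
# so the plain (non-parallel) HDF5 library suffices.
mode = "w" if rank == first else "a"
with h5py.File(fname, mode) as f:
    f.create_dataset(f"rank_{rank}", data=np.arange(10) * rank)

# Pass the baton to the next task in the group, if any.
nxt = rank + 1
if nxt < nprocs and nxt // GROUP_SIZE == group:
    comm.send(None, dest=nxt, tag=0)
```

Since each file only ever sees one writer at a time, MIF sidesteps collective-I/O coordination entirely while still scaling the number of concurrent file streams with the number of groups.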

Dave Pearah, The HDF Group In my previous post—HDF: The Next 30 Years (Part 1)—I outlined the challenges and opportunities facing The HDF Group as an open source company. In a nutshell, the opportunity is large-scale adoption around the world in many different industries with great community-driven development (700+ projects on GitHub); the challenge is generating sufficient profit from the existing business (consulting) to sustainably extend and maintain the core HDF5 platform. The HDF Group is blessed with an amazingly talented + passionate + dedicated team of folks who care deeply about the HDF community, and we're all working together to determine the best path forward to sustainability, i.e. the NEXT 30 years. We want to share some of the steps that we're already taking, and -- more importantly --...

The HDF Group’s HDF Server has been nominated for Best Use of HPC in the Cloud and Best HPC Software Product or Technology in HPCWire’s 2016 Readers’ Choice Awards. HDF Server is a Python-based web service that enables full read/write web access to HDF data; it can be used to send and receive HDF5 data through an HTTP-based REST interface. While HDF5 provides powerful scalability and speed for complex datasets of all sizes, many HDF5 datasets used in HPC environments are extremely large and cannot easily be downloaded or moved across the internet. Users often need to access only a small subset of the data. Using HDF Server, data can be kept in one...
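As a rough illustration of that REST-style subset access, here is a minimal hedged sketch that fetches a small hyperslab of a remote dataset over HTTP. The host, dataset UUID, and select syntax follow h5serv-style conventions but are assumptions here, not guaranteed URLs:

```python
# Minimal sketch: reading a small subset of a remote HDF5 dataset through
# HDF Server's REST interface. The host, dataset id, and query syntax are
# illustrative assumptions; consult the HDF Server docs for the real API.
import requests

ENDPOINT = "http://hdfserver.example.com"           # assumed server address
DATASET = "d-0568d8c5-a77e-11e4-9f7a-3c15c2da029e"  # assumed dataset UUID

# Request only the hyperslab we need instead of downloading the whole file.
resp = requests.get(
    f"{ENDPOINT}/datasets/{DATASET}/value",
    params={"select": "[0:10,0:10]"},               # 10x10 slice
    headers={"Accept": "application/json"},
)
resp.raise_for_status()
print(resp.json()["value"])                         # JSON-encoded array data
```

The point of the design is that the expensive resource (the full dataset) stays put, and only the requested slice crosses the network.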

We are currently planning a Q2 2016 release of the HDF5/ODBC connector. In the meantime, we are working with a few early adopters on finalizing the initial feature set. If you have additional questions about HDF5/ODBC, or if you would like to become an early adopter, please contact us ...
