A Turing Test for HDF5

by John Readey “For Heaven’s sake, do not confound HDF5 with anything else!” —Friedrich Nietzsche, Ecce Homo Alan Turing devised the Turing test (he referred to it as the “Imitation Game”) in 1950 to determine whether a machine could think. He believed that if machines could give answers that were indistinguishable from human answers, then […]

Atmos Data Store project: A Highly Scalable Data Service (HSDS) Use Case

The Atmos Data Store team from Equinor have been working with The HDF Group over the last couple of years to incorporate the Highly Scalable Data Service (HSDS) as part of their system for managing hundreds of terabytes of metocean (meteorological and oceanographic) data. They’ve graciously agreed to provide some background on their project and

Speed up cloud access using multiprocessing!

Accessing large data stores over the internet can be rather slow, but you can often speed things up using multiprocessing, i.e., running multiple processes that divide up the work. Because each process spends much of its time waiting on data, this frequently helps even when you run more processes than your computer has cores.
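As a rough illustration of the idea above, here is a minimal sketch using Python's standard `multiprocessing.Pool`. It is not taken from the post itself: `split_ranges`, `fetch_sum`, and `parallel_sum` are hypothetical names, and the remote read is simulated with a short `time.sleep` rather than a real request to a data service.

```python
import time
from multiprocessing import Pool


def split_ranges(total, parts):
    """Split range(total) into up to `parts` contiguous (start, stop) pairs."""
    step = max(1, -(-total // parts))  # ceiling division, never zero
    return [(s, min(s + step, total)) for s in range(0, total, step)]


def fetch_sum(bounds):
    """Hypothetical stand-in for a remote read.

    The sleep simulates network latency; a real worker would request
    the rows in [start, stop) from the data service instead.
    """
    start, stop = bounds
    time.sleep(0.05)
    return sum(range(start, stop))


def parallel_sum(total, num_procs):
    """Divide the work across `num_procs` processes and combine the results."""
    with Pool(processes=num_procs) as pool:
        return sum(pool.map(fetch_sum, split_ranges(total, num_procs)))


if __name__ == "__main__":
    # More processes than cores can still pay off here, because each
    # worker spends most of its time waiting rather than computing.
    print(parallel_sum(1000, 8))
```

Since the workers are mostly idle while waiting, the total wall-clock time in this sketch is close to that of a single simulated request rather than the sum of all of them. Whether a real workload behaves the same way depends on the service and the network.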

BioSimulations: a platform for sharing and reusing biological simulations

The group at BioSimulations.org has been doing some very interesting work using HSDS on Kubernetes to store biomodelling data and visualizing the results using Vega, as described in the paper below. BioSimulations chose HSDS for its support for very large data sets, its REST API (for use with web applications), and its ability to run on Google Cloud as well as in on-premise installations.

New features in HSDS version 0.6

HSDS (Highly Scalable Data Service) is a REST-based service for reading and writing HDF data. It was initially developed as a NASA Access 2015 project, and The HDF Group has continued to invest in it. As we’ll see, the latest version has a bevy of new and interesting features.

Large wind dataset now available via HDF Cloud

50 TB of Wind Integration National Dataset (WIND) Toolkit data is now available to anyone via HDF Cloud, thanks to a collaboration between John Readey, Sr. Architect at The HDF Group, and NREL (the National Renewable Energy Laboratory). Access the data now with a Jupyter Notebook or through the interactive web-based visualization tool. If you want

The GFED Analysis Tool – An HDF Server Implementation

The HDF Server allows producers of complex datasets to share their results with a wide audience. We used it to develop the Global Fire Emissions Database (GFED) Analysis Tool, a website that guides the user through our dataset. A simple web map interface allows users to select an area of interest and produce data visualization charts.
