HDF Server: Best Use of HPC in the Cloud – vote in HPCwire’s 2016 Readers’ Choice Awards

The HDF Group’s HDF Server has been nominated for Best Use of HPC in the Cloud and Best HPC Software Product or Technology in HPCwire’s 2016 Readers’ Choice Awards.

HDF Server is a Python-based web service that enables full read/write web access to HDF data – it can be used to send and receive HDF5 data using an HTTP-based REST interface.

While HDF5 provides powerful scalability and speed for complex datasets of all sizes, many HDF5 datasets used in HPC environments are extremely large and cannot easily be downloaded or moved across the internet whenever someone needs them. Users often need only a small subset of the data. With HDF Server, data can be kept in one central location and its content served through well-defined URIs. HDF Server enables exploration and analysis of the data while minimizing the number of bytes that need to be transmitted over the network.
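To make the idea of fetching only a subset concrete, here is a minimal sketch of how a client might build the URI for a slice of one dataset. The endpoint, domain name, and dataset id are hypothetical, and the `host` and `select` query parameters are assumptions about the server's REST interface; check the h5serv documentation for the exact form before relying on them.

```python
from urllib.parse import urlencode


def build_select_url(endpoint, dataset_id, slices, domain):
    """Build a URL that asks for part of one dataset's values.

    endpoint   -- base URL of the server (hypothetical here)
    dataset_id -- identifier of the dataset on the server (hypothetical here)
    slices     -- list of (start, stop) pairs, one per dataset dimension
    domain     -- name of the hosted HDF5 file (assumed `host` parameter)
    """
    # Assumed selection syntax: one start:stop range per dimension.
    select = "[" + ",".join(f"{start}:{stop}" for start, stop in slices) + "]"
    # Keep the brackets, colons, and commas readable instead of percent-encoding them.
    query = urlencode({"host": domain, "select": select}, safe="[]:,")
    return f"{endpoint}/datasets/{dataset_id}/value?{query}"


if __name__ == "__main__":
    # Ask for rows 0-9 and columns 0-3 of a two-dimensional dataset.
    print(build_select_url("http://example.org:5000", "abc-123",
                           [(0, 10), (0, 4)], "mydata.example.org"))
```

Only the requested rows and columns would travel over the network; the rest of the array stays on the server.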

Providing a web-services-native interface to HDF5 data has two benefits:

  1. Data providers can consider hosting HDF5 data in a cloud/utility computing environment at a fraction of the existing cost.
  2. HDF data hosted by the server can be made available to a global community of HPC users through a laptop browser from anywhere on the internet, breaking down the barriers of expensive, traditional HPC.

With HDF Server, web apps can use data supplied by the server, and users will be able to develop HDF5-based web applications with much less effort than previous technologies required.

HDF Server supports CRUD (create, read, update, delete) operations on the full spectrum of HDF5 objects, including groups, links, datasets, attributes, and committed data types. Because it is a REST service, clients can be developed in JavaScript, Python, C, and other common languages.
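As a rough illustration of how the CRUD operations above usually map onto HTTP in a REST service, the sketch below pairs each operation with its conventional verb. The helper `request_line` and the collection names in the paths are illustrative assumptions, not the server's documented routes; a real server may differ for some object types (for example, creating a named object with PUT rather than POST).

```python
# Conventional REST mapping from CRUD operations to HTTP methods.
CRUD_TO_HTTP = {
    "create": "POST",
    "read": "GET",
    "update": "PUT",
    "delete": "DELETE",
}


def request_line(operation, collection, object_id=None):
    """Return the HTTP method and path for a CRUD operation.

    collection -- kind of object, e.g. "groups" or "datasets" (assumed names)
    object_id  -- identifier of an existing object; omit it when creating one
    """
    method = CRUD_TO_HTTP[operation]
    path = f"/{collection}" if object_id is None else f"/{collection}/{object_id}"
    return f"{method} {path}"


if __name__ == "__main__":
    print(request_line("create", "groups"))
    print(request_line("read", "datasets", "abc-123"))
    print(request_line("delete", "datasets", "abc-123"))
```

Because the interface is plain HTTP, the same requests could be issued from any language with an HTTP client.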

HDF Server extends the HDF5 data model to efficiently store multi-TB data arrays and access them over the web using a RESTful API. In addition, it provides:

  • Support for dataset queries
  • Pluggable authentication layer
  • Access control at the Group and Dataset level
  • Python client SDK (h5pyd) that lets Python clients access the server through an h5py-compatible API
  • Runs on Linux, OS X, and Windows
  • Open-source implementation (www.github.com/HDFGroup/h5serv)
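The practical meaning of an h5py-compatible client API is that the same reading code can work against a local file or a server. The sketch below shows this by taking the file class as a parameter; `read_rows` is a hypothetical helper, and the domain and dataset names in the comments are placeholders, not real resources.

```python
def read_rows(file_class, name, dataset_path, start, stop):
    """Read rows [start, stop) of a dataset through an h5py-style File class.

    file_class   -- a class with the h5py File interface (open by name and mode,
                    usable as a context manager, indexable by dataset path)
    name         -- local file name, or a server domain for a remote client
    dataset_path -- path of the dataset inside the file, e.g. "/temperature"
    """
    with file_class(name, "r") as f:
        # Slicing the dataset requests only the selected rows.
        return f[dataset_path][start:stop]


# Possible usage, assuming the packages are installed and the data exists:
#
#   import h5py
#   local = read_rows(h5py.File, "local_copy.h5", "/temperature", 0, 10)
#
#   import h5pyd
#   remote = read_rows(h5pyd.File, "mydata.example.org", "/temperature", 0, 10)
```

With the remote client, connection details such as the server endpoint would still have to be supplied, for instance through the client's configuration.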

Since HDF Server supports the full range of HDF data, providing read/write access for most HDF types, it enables some interesting scenarios such as:

  • Collaborative annotation of data sets
  • Compute clusters that use HDF Server for data access
  • Continually updated data shares (e.g., stock market quotes, sensor data)
  • Storing the analysis of the data (say, a visualization) with the original source data

HDF Server Upcoming Features:

  • Support for object storage (e.g. AWS S3)
  • Enable extremely high transaction and throughput levels
  • HDF5 API compatible library for C and Fortran users

As an example of HDF Server’s impact on the HPC community, The HDF Group’s collaboration with NASA on HDF Server is helping NASA advance its enterprise goal of shifting its computing to a more cloud-centric model, which is intended to reduce NASA’s total cost of ownership (TCO) for providing remote access to its massive stores of data. It will enable greater use of NASA datasets in the public cloud and hence better utilization of data for research and discovery. With a cost-effective data platform, much more of the available data could potentially be brought into the public cloud.

Also, by removing the need to manage files and providing high-throughput data access, HDF Server allows data pipeline processes to be managed more efficiently. For supercomputer centers, simulation output could be off-loaded from expensive distributed file systems to private cloud object stores for analysis and research.

Each year, the HPCwire Readers’ Choice Awards are determined by readers across the HPC community to recognize the most outstanding individuals, organizations, products, and technologies in the industry.

HPCwire readers who actively vote during the elections will decide the winners. Make your voice heard before it’s too late! Polls are open for the HPCwire Readers’ Choice Awards until September 30, 2016.

Register with HPCwire as a reader and vote for HDF Server as Best Use of HPC in the Cloud at https://www.hpcwire.com/2016-hpcwire-readers-choice-awards/.
