Join The HDF Group for a webinar of three short presentations on the different approaches and exciting work of HDF5 C++ community members. Steven Varga, Martin Shetty and Eugen Wintersberger, and Chris Green and Marc Paterno will share their vision for and experiences using HDF5 with C++.
This webinar will be held Thursday, January 24th, 10:00 a.m. – 11:00 a.m. Central time.
H5CPP
Steven Varga
http://h5cpp.org/
Abstract
It is not surprising that the level at which HDF5 is the most flexible and performant (the C-API) is the least usable. While the Pythonic Wars have restored usability, the outcome did not expose HDF5’s raw power: low latency and high throughput.
H5CPP is a value-based solution and stands for choice when it comes to delivering performance. It embraces modern C++ idioms, and, with a template metaprogramming-based parser, provides a syntactic experience similar to Python—at zero performance cost. H5CPP supports popular linear algebra libraries aimed at data scientists, and its LLVM-based compiler brings introspection for C++, today. Finally, the STL support targets software writers who need dense, general purpose, portable, binary storage, aka HDF5.
In this presentation, we will highlight the improvements offered by this novel approach:
- It is the first earnest invitation to C++ programmers to enjoy seamless persistence à la Python pickle.
- With a custom data pipeline built around the latest HDF5 optimized I/O calls, it can deliver near hardware-level performance using code that is generated “mechanically” rather than custom written by an expert.
- Come prepared for peak performance fireworks throughout! [fn:1]
[fn:1] Spark-lers are a different matter: “No person shall store, sell, offer or expose for sale, or have in possession…any sparklers” (Delaware code, title 16, section 6901)
h5cpp Wrapper
Martin Shetty and Eugen Wintersberger
https://github.com/ess-dmsc/h5cpp
Abstract
We believe that a well-designed library should allow programmers to focus on what their code is meant to achieve rather than on the peculiarities of the implementation they are using. With our wrapper we have sought to make things intuitive for a user of modern C++: ensuring RAII and providing scoped enums, iterators, and other facilities for integration with STL features. While the current abstraction covers the C-API quite broadly, we also allow the user to easily access the underlying C constructs and use any less common HDF5 features at no additional cost. We demonstrate the benefits of using this library in a number of contexts.
We present the h5cpp wrapper as:
- easy to build and use
- functionally equivalent to the HDF5 C-API
- simple in its resource management (constructors and destructors)
- competitive with the native C-API in I/O performance
- STL compliant in its interface, allowing seamless integration with STL algorithms
- compatible with C++11 and higher
- thoroughly tested
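As a rough illustration of the RAII idea mentioned above (resource management through constructors and destructors, with access to the underlying C construct), here is a minimal, standard-library-only sketch. It does not use the h5cpp wrapper or the HDF5 C-API: `mock_c_api`, `Handle`, `native()`, and `open_after_scope()` are hypothetical names invented for this example, and the "handle" is just an integer id.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a C-style handle API (NOT the HDF5 C-API).
namespace mock_c_api {
int open_count = 0;  // number of handles currently open

inline int open_handle() {
    static int next_id = 0;
    ++open_count;
    return next_id++;
}

inline void close_handle(int /*id*/) { --open_count; }
}  // namespace mock_c_api

constexpr int kInvalidId = -1;

// Hypothetical RAII wrapper: acquires in the constructor, releases in the
// destructor, and is move-only so a handle is never closed twice.
class Handle {
public:
    Handle() : id_(mock_c_api::open_handle()) {}
    ~Handle() { reset(); }

    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

    Handle(Handle&& other) noexcept : id_(other.id_) { other.id_ = kInvalidId; }
    Handle& operator=(Handle&& other) noexcept {
        if (this != &other) {
            reset();
            id_ = other.id_;
            other.id_ = kInvalidId;
        }
        return *this;
    }

    // Escape hatch: hand the raw id to code that talks to the C layer directly.
    int native() const noexcept { return id_; }
    bool valid() const noexcept { return id_ != kInvalidId; }

private:
    void reset() noexcept {
        if (id_ != kInvalidId) {
            mock_c_api::close_handle(id_);
            id_ = kInvalidId;
        }
    }

    int id_;
};

// Opens a handle, moves it, and reports how many handles remain open
// once both wrapper objects have gone out of scope.
inline int open_after_scope() {
    {
        Handle a;
        Handle b = std::move(a);  // ownership transfers; 'a' no longer closes anything
        assert(!a.valid() && b.valid());
        assert(mock_c_api::open_count == 1);
    }
    return mock_c_api::open_count;
}
```

Calling `open_after_scope()` exercises the whole life cycle; the point of the design is that no explicit close call appears anywhere in user code.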
We support and continuously test on the most popular compilers for all major platforms:
- Linux
- Windows
- OSX
Though the project is still quite young (started in March 2017), the code is currently used in active projects by:
- ESS (https://europeanspallationsource.se/)
- DESY (https://www.desy.de/)
- ISIS (https://www.isis.stfc.ac.uk/Pages/home.aspx#)
- PSI (https://www.psi.ch/)
Core development and maintenance are done by DESY and ESS staff, ensuring long-term support for the library.
Ntuple: Tabular Data in HDF5 with C++
Chris Green and Marc Paterno
https://bitbucket.org/fnalscdcomputationalscience/hep_hpc
Abstract
We describe a C++ template-based system for writing tabular HDF5 data. Tables are represented by HDF5 groups, and columns are represented by extensible datasets whose elements are fixed-size arrays of basic types. Row-oriented table insertions are buffered for efficiency and written periodically to the HDF5 file. The Ntuple system is simple enough to define a table in a single call, yet flexible enough to allow per-dataset filter configuration, thread-safe insertion of multiple tables into a single user-opened file, and full user access to the datasets representing the columnar data via thin-wrapper classes representing HDF5 entities.
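The buffering scheme described above (row-oriented inserts collected per column and written out periodically) can be sketched with the standard library alone. This is not the hep_hpc Ntuple API: `BufferedTable`, `insert`, `flush`, `stored`, and the flush threshold are hypothetical names chosen for illustration, plain in-memory vectors stand in for extensible HDF5 datasets, and the sketch assumes C++14 and ignores thread safety and filters.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical sketch (NOT the hep_hpc Ntuple API): one vector per column
// stands in for an extensible dataset; a second set of vectors buffers rows.
template <typename... Ts>
class BufferedTable {
public:
    explicit BufferedTable(std::size_t flush_threshold)
        : threshold_(flush_threshold) {}

    // Row-oriented insert: one value per column, appended to the buffers.
    void insert(const Ts&... values) {
        append(std::index_sequence_for<Ts...>{}, values...);
        if (++buffered_rows_ >= threshold_) flush();
    }

    // Move all buffered rows into column storage. A real writer would
    // extend each dataset and write the buffered block to the file here.
    void flush() {
        move_all(std::index_sequence_for<Ts...>{});
        buffered_rows_ = 0;
    }

    // Read-only view of the I-th column's "written" data.
    template <std::size_t I>
    const auto& stored() const { return std::get<I>(storage_); }

    std::size_t buffered_rows() const { return buffered_rows_; }

private:
    using Columns = std::tuple<std::vector<Ts>...>;

    template <std::size_t... Is>
    void append(std::index_sequence<Is...>, const Ts&... values) {
        // Expand both packs in lockstep: value i goes to buffer i.
        int expand[] = {0, (std::get<Is>(buffers_).push_back(values), 0)...};
        (void)expand;
    }

    template <std::size_t... Is>
    void move_all(std::index_sequence<Is...>) {
        int expand[] = {
            0, (move_column(std::get<Is>(buffers_), std::get<Is>(storage_)), 0)...};
        (void)expand;
    }

    template <typename T>
    static void move_column(std::vector<T>& from, std::vector<T>& to) {
        to.insert(to.end(), from.begin(), from.end());
        from.clear();
    }

    std::size_t threshold_;
    std::size_t buffered_rows_ = 0;
    Columns buffers_;
    Columns storage_;
};
```

A table is then defined in one line, for example `BufferedTable<int, double> t(1000);`, and filled with `t.insert(42, 3.14);`. Rows reach column storage only when the threshold is hit or `flush()` is called, which is the trade-off the buffering makes: fewer, larger writes in exchange for holding recent rows in memory.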