Elena Pourmal, The HDF Group
UPDATE January 19, 2016: The HDF5-1.10.0-alpha1 release is now available, adding Collective Metadata I/O to these features:
– Concurrent Access to an HDF5 File: Single Writer / Multiple Reader (SWMR)
– Virtual Dataset (VDS)
– Scalable Chunk Indexing
– Persistent Free File Space Tracking
We’re pleased to announce the release of HDF5 1.10.0-alpha0.
HDF5 1.10.0, planned for release in Spring 2016, is a major release containing many new features. On January 6, 2016, we announced the release of the first alpha version of the software.
The alpha0 release contains some (but not all) of the features that will be in HDF5 1.10.0. The Single Writer/Multiple Reader and Virtual Dataset features, described below, are both in this alpha release, as are scalable chunk indexing and persistent free file space tracking. More features, such as enhancements to parallel HDF5 and support for compressing contiguous datasets, will be added in upcoming alpha releases.
Here are the highlights:
Concurrent Access to an HDF5 File: Single Writer/Multiple Reader (SWMR)
Data acquisition and computer modeling systems often need to analyze and visualize data while it is being written. It is not unusual, for example, for an application to produce results in the middle of a run that suggest some basic parameters be changed, sensors be adjusted, or the run be scrapped entirely.
To enable users to check on such systems, we have been developing a concurrent read/write file access pattern we call SWMR (pronounced swimmer). SWMR functionality allows a writer process to add data to a file while multiple reader processes read from the file.
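The writer/reader pattern described above can be sketched as follows. This is a minimal illustration, not the library's reference usage: it assumes the third-party h5py Python bindings (not mentioned in this post) built against HDF5 1.10 or later, and the file name, dataset name, and block size are hypothetical. The reader runs as a separate process while the writer still has the file open.

```python
import os
import subprocess
import sys
import tempfile

import h5py
import numpy as np

# Hypothetical file and dataset names, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "swmr_demo.h5")

# Reader program, run as a separate process while the writer keeps the file open.
READER = """
import sys, h5py
with h5py.File(sys.argv[1], "r", libver="latest", swmr=True) as f:
    dset = f["samples"]
    dset.refresh()  # a long-running reader would call this before each poll
    n = dset.shape[0]
    print(n, int(dset[n - 1]))
"""

reports = []
# SWMR relies on the newer file format, so the writer requests libver="latest".
with h5py.File(path, "w", libver="latest") as f:
    dset = f.create_dataset("samples", shape=(0,), maxshape=(None,),
                            chunks=(4,), dtype="i4")
    f.swmr_mode = True  # from here on, readers may open the file concurrently

    for step in range(3):
        # Append one block of four values, then flush so readers can see it.
        start = dset.shape[0]
        dset.resize((start + 4,))
        dset[start:start + 4] = np.arange(start, start + 4)
        dset.flush()

        # Launch a reader while the writer is still active.
        out = subprocess.run([sys.executable, "-c", READER, path],
                             capture_output=True, text=True, check=True)
        reports.append(tuple(int(x) for x in out.stdout.split()))

print(reports)  # (length seen by reader, last value seen) after each flush
```

In a real acquisition system the reader would be a long-lived process that polls the dataset, calling refresh before each read, rather than a short-lived script started after every flush.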
Virtual Dataset (VDS)
With a growing amount of data in HDF5, the need has emerged to access data stored across multiple HDF5 files using standard HDF5 objects, such as groups and datasets. The new virtual dataset feature (VDS) enables an application to draw on multiple datasets and files to create virtual datasets without moving or rewriting any data.
For example, an X-ray image may be stored across different HDF5 datasets in multiple HDF5 files. With VDS, the whole image may be accessed by an application without any specific knowledge of where data for each part of the image is stored.
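To make the image example concrete, here is a minimal sketch of a virtual dataset that stitches four strips, each stored in its own file, into one image. It assumes the third-party h5py Python bindings (not mentioned in this post) built against HDF5 1.10 or later; the file names, dataset names, and sizes are hypothetical.

```python
import os
import tempfile

import h5py
import numpy as np

workdir = tempfile.mkdtemp()
vds_path = os.path.join(workdir, "image_vds.h5")

# Four hypothetical source files, each holding one 10x100 strip of a 40x100 image.
# Strip i is filled with the value i so the mapping is easy to check.
for i in range(4):
    with h5py.File(os.path.join(workdir, f"strip_{i}.h5"), "w") as src:
        src.create_dataset("data", data=np.full((10, 100), i, dtype="i4"))

# Describe how regions of the virtual dataset map onto the source datasets.
layout = h5py.VirtualLayout(shape=(40, 100), dtype="i4")
for i in range(4):
    source = h5py.VirtualSource(os.path.join(workdir, f"strip_{i}.h5"),
                                "data", shape=(10, 100))
    layout[i * 10:(i + 1) * 10, :] = source

# Only the mapping is stored in this file; no image data is copied.
with h5py.File(vds_path, "w", libver="latest") as f:
    f.create_virtual_dataset("image", layout, fillvalue=-1)

# The application reads the whole image without knowing where each strip lives.
with h5py.File(vds_path, "r") as f:
    image = f["image"][:]

print(image.shape, image[::10, 0].tolist())
```

The fill value is what a reader would see for any region whose source file is missing, which is one way to notice an incomplete set of source files.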
Fine-tuning the Metadata Cache
The orderly operation of the metadata cache is crucial to SWMR functioning. A number of APIs have been developed to handle the requests from writer and reader processes and to give applications the control of the metadata cache they might need. These metadata cache APIs can also be used when SWMR is not being used.
Scalable Chunk Indexing
The indexing structure for chunked datasets has been changed to significantly improve performance.
Persistent Free File Space Tracking
Usage patterns when working with an HDF5 file sometimes result in wasted space within the file. This can also impair access times when working with the resulting files. The new file space management feature provides strategies for managing space in a file to improve performance in both of these arenas.
We encourage our users to give it a try – we would greatly appreciate any feedback you might have regarding this alpha0 release.
Please send your comments to the HDF Helpdesk (help@hdfgroup.org). If sending a bug report, please include a C example program that we can use to reproduce the issue.
PLEASE BE AWARE that the file format in this first alpha release is not yet stable. Until further notice, DO NOT keep files created with this release.
For more information on the currently available features, see the “User, Reference, and Design Documentation.”
* * * * *
The HDF5 data model, file format, API, library, and tools are open and distributed without charge.
Building on its 28-year history, The HDF Group offers personalized consulting, training, design, software development, and support services to help clients take full advantage of HDF5 capabilities in addressing their unique data management challenges. To discuss these services, please contact the HDF Group Help Desk (help@hdfgroup.org).
The newest HDF5-1.10.0-alpha release can be obtained from The HDF Group Downloads page: https://www.hdfgroup.org/downloads/
It can also be obtained directly from: https://support.hdfgroup.org/HDF5/release/obtain5110.html
Documentation can be found in: https://support.hdfgroup.org/HDF5/docNewFeatures/
Thanks for the updates.
Is the HDF5 on-disk format changed in version 1.10? Will it be possible to access files saved with 1.10 using the old library version 1.8.x?
Hi Antonino,
If SWMR or VDS is used, the 1.8 library will not be able to access the files since those two features require extensions to the HDF5 File Format that were not available in 1.8.
One can use tools we provide to convert the files:
– When the VDS feature is used, h5repack should be used to convert the file to 1.8 format.
– When the SWMR feature is used, one can run h5repack, or better, h5format_convert, to convert a 1.10.0 file to a 1.8 file.
h5repack rewrites the whole file, while h5format_convert does the conversion in place. It rewrites only HDF5 metadata in 1.8 format.
Let us know what you think!
Elena