The HDF Group is pleased to announce that we are actively developing an HDF5 Connector for Apache Spark™ and are seeking Beta Users for this software.
The HDF5 Spark Connector allows users of the Apache Spark™ open source processing engine to natively query data stored in HDF5 files.
This software is being developed in response to interest from members of the HDF5 user community. Many of them want to use Spark to get the same speed, scalability, and reliability in data processing that they expect from HDF5 for I/O. To date, they have been hampered by Spark’s inability to access HDF5 files directly. As a workaround, they have first had to convert existing data from HDF5 to another storage format that Spark can read directly, an extra step they would rather avoid. We consider this software to be an exciting bridge between two very different but important and influential open source big data technologies: HDF5 and Apache Spark.
The HDF Group is eager to recruit HDF5 users who are interested in joining the Beta Test program for the HDF5 Spark Connector. As a Beta Tester, you will have the opportunity to begin using this software and, over the next few months, to provide crucial feedback to The HDF Group that will help guide the functionality and roadmap for this product. If you would like to be a Beta Tester, please sign up here. Please let us know if you have questions or would just like to stay updated on this product.