big data Tag

By Francesc Alted. He is a freelance consultant and developing author of different open source libraries like PyTables, Blosc, bcolz and numexpr and an experienced programmer in Python and C. Francesc collaborates regularly with the The HDF Group in different projects. We explain our solution for handling big data streams using HDF5 (with a little help from other tools). In the ubiquitously connected world that we live in, there are good reasons to understand what data is transferred across a network and how to extract information out of it. Being able to capture and log the different network packets can be used for many tasks including: Protecting against cyber threats Enforcing policy Extracting and consolidating valuable information Debugging protocols/services Understanding how users use...

Pearah joins The HDF Group as new Chief Executive Officer

Champaign, IL — The HDF Group today announced that its Board of Directors has appointed David Pearah as its new Chief Executive Officer. The HDF Group is a software company dedicated to creating high performance computing technology to address many of today’s Big Data challenges.

Pearah replaces Mike Folk upon his retirement after ten years as company President and Board Chair. Folk will remain a member of the Board of Directors, and Pearah will become the company’s Chairman of the Board of Directors.

Pearah said, “I am honored to have been selected as The HDF Group’s next CEO. It is a privilege to be part of an organization with a nearly 30-year history of delivering innovative technology to meet the Big Data demands of commercial industry, scientific research and governmental clients.”

Industry leaders in fields from aerospace and biomedicine to finance join the company’s client list.  In addition, government entities such as the Department of Energy and NASA, numerous research facilities, and scientists in disciplines from climate study to astrophysics depend on HDF technologies.

Pearah continued, “We are an organization led by a mission to make a positive impact on everyone we engage, whether they are individuals using our open-source software, or organizations who rely on our talented team of scientists and engineers as trusted partners. I will do my best to serve the HDF community by enabling our team to fulfill their passion to make a difference.  We’ve just delivered a major release of HDF5 with many additional powerful features, and we’re very excited about several innovative new products that we’ll soon be making available to our user community.”

“Dave is clearly the leader for HDF’s future, and

We are excited and pleased to announce HDF5-1.10.0, the most powerful version of our flagship software ever.> This major new release of HDF5 is more powerful than ever before and packed with new capabilities that address important data challenges faced by our user community. HDF5 1.10.0 contains many important new features and changes, including those listed below. The features marked with * use new extensions to the HDF5 file format. The Single-Writer / Multiple-Reader or SWMR feature enables users to read data while concurrently writing it. * The virtual dataset (VDS) feature enables users to access data in a collection of HDF5 files as a single HDF5 dataset and to use the HDF5 APIs to work with that dataset. *   (NOTE:...

David Dotson, doctoral student, Center for Biological Physics, Arizona State University; HDF Guest Blogger

Recently I had the pleasure of meeting Anthony Scopatz for the first time at SciPy 2015, and we talked shop. I was interested in his opinions on MDSynthesis, a Python package our lab has designed to help manage the complexity of raw and derived data sets from molecular dynamics simulations, about which I was

Mohamad Chaarawi, The HDF Group

Second in a series: Parallel HDF5

NERSC’s Cray Sonexion system provides data storage for its Mendel scientific computing cluster.

In my previous blog post, I discussed the need for parallel I/O and a few paradigms for doing parallel I/O from applications. HDF5 is an I/O middleware library that supports (or will support in the near future) most of the I/O paradigms we talked about.

In this blog post I will discuss how to use HDF5 to implement some of the parallel I/O methods and some of the ongoing research to support new I/O paradigms. I will not discuss pros and cons of each method since we discussed those in the previous blog post.

But before getting on with how HDF5 supports parallel I/O, let’s address a question that comes up often, which is,

“Why do I need Parallel HDF5 when the MPI standard already provides an interface for doing I/O?”

Lindsay Powers - The HDF Group The HDF Group provides free, open-source software that is widely used in government, academia and industry. The goal of The HDF Group is to ensure the sustainable development of HDF (Hierarchical Data Format) technologies and the ongoing accessibility of HDF-stored data because users and organizations have mission-critical systems and archives relying on these technologies. These users and organizations are a critical element of the HDF community and an important source of new and innovative uses of, and sustainability for, the HDF platforms, libraries and tools. We want to create a sustainability model for the open access platforms and libraries that can serve these diverse communities in the future use and preservation of their data. As a...

  • 1