HDF Lab

What is HDF Lab?

HDF Lab is a JupyterHub/JupyterLab workspace managed by The HDF Group. HDF Lab enables members of the HDF community to run Python, Julia, and R code in the cloud using just their browser.

When you sign in to HDF Lab, you have access to a compute environment that is set up for you to start using HDF immediately—no need to install software, download data, or otherwise wrestle with your system (see: https://xkcd.com/1987/). In HDF Lab you have access to the equivalent of a powerful desktop computer running in the cloud, available whenever you need it.

Key aspects of HDF Lab are the JupyterLab interface, pre-installed software, and access to the cloud:

  • JupyterLab interface—The Jupyter “notebook” interface allows you to combine text, images, and code in one environment. It’s a great place for experimental coding or data exploration—it’s easy to run a snippet of code, see the result, and modify as necessary. You also have access to a terminal to run regular command line applications.
  • Pre-installed software—HDF Lab comes with all the libraries and packages needed to work with HDF—either using the HDF5 library or the Highly Scalable Data Service (HSDS). If you are just learning about HDF, it’s a great place to get started—HDF Lab provides a set of tutorials to get you up to speed on using HDF.
  • Access to the cloud—Since with HDF Lab you are already “in” the cloud, you’ll have a much easier time accessing cloud-based data files. For most users, downloading data files to their HDF Lab environment will be much faster than downloading to their desktop or laptop computer. Several interesting datasets are accessible via HSDS, including petabytes of data provided by the National Renewable Energy Laboratory (NREL); see the example after this list. Other cloud-based data is also easy to reach; see https://registry.opendata.aws/ for a list of public data hosted on AWS. In addition, HDF Lab is configured with access to a co-located HSDS instance; more about that below.
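
As a quick illustration of accessing cloud-hosted data from HDF Lab, the sketch below uses h5pyd (the h5py-compatible Python client for HSDS) to open one of the NREL datasets. The domain path /nrel/wtk-us.h5 and the dataset name windspeed_100m follow NREL's published examples and are assumptions here; adjust them to whatever is visible in your account.

    import h5pyd  # h5py-compatible client that talks to HSDS rather than local files

    # Domain path and dataset name are placeholders based on NREL's public examples.
    with h5pyd.File("/nrel/wtk-us.h5", "r") as f:
        dset = f["windspeed_100m"]          # gridded wind speed, (time, y, x)
        print(dset.shape, dset.dtype)
        # Only the requested hyperslab is transferred, not the whole dataset.
        sample = dset[0, 500:510, 500:510]
        print(sample)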

HDF Lab access starts with a 30-day free trial; thereafter it is just $10/month or $100/year. This gives you access to 8 GB of RAM for running your Jupyter notebooks, a 10 GB local disk, and 200 GB of storage on HDF Server (HSDS).

Why Use HDF Lab?

  • Convenience—The compute environment is set up with all the software needed to start using HDF immediately.
  • Learn HDF—Tutorials and examples let you quickly develop an understanding of HDF and related tools.
  • Cloud Access—Run code in the cloud without needing to set up (and pay for) an account with a cloud provider.
  • Data Access—Petabytes of data are available in the cloud.
  • Experiment—Try out HDF5 and HSDS in HDF Lab. If they suit your application but you need more capacity than HDF Lab provides, you can set up a similar environment in your own cloud account or on-premises system.
  • Performance—Harness the power of a compute cluster using HSDS.
  • Scalability—Any number of users can use HDF Lab simultaneously (let us know if you plan on hosting a class on HDF Lab so we can properly scale the system).
  • Sharing—Share HDF data with other HDF Lab users (you control exactly what you share and with whom).
  • Cost—Free 30-day trial, then just $10/month or $100/year.

How does HDF Lab work?

HDF Lab runs as a set of components (pods) on a Kubernetes cluster in AWS. When a user signs in to HDF Lab, they are authenticated with their HDF Group credentials, and a new pod is spun up to host their virtual computing environment. Each user pod is linked with a 10 GB virtual disk that can be used to store notebooks, code, or data files. Anything you store on the drive will be available the next time you log in.

In addition, you will have access to HSDS (which itself runs as a set of pods). HSDS enables high-performance read/write access to content stored on AWS S3. Since your compute environment, HSDS, and S3 are all located in the same AWS Region and share a high-speed network, you get much better performance than when accessing cloud data from your desktop computer.
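
Because h5pyd mirrors the h5py API, code written against local HDF5 files usually needs only its import (and file path) changed to run against HSDS. A minimal sketch, assuming your home folder exists and using "myusername" as a placeholder account name:

    import h5py    # local HDF5 files on the 10 GB disk
    import h5pyd   # same API, but targets HSDS domains backed by S3

    data = list(range(100))

    # Local HDF5 file in the notebook's working directory.
    with h5py.File("local_example.h5", "w") as f:
        f.create_dataset("values", data=data)

    # HSDS domain under your home folder ("myusername" is a placeholder).
    with h5pyd.File("/home/myusername/remote_example.h5", "w") as f:
        f.create_dataset("values", data=data)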

On HSDS there are example data files under “/shared/” that all HDF Lab users have access to. In addition, the folder /home/<username>/ will be available for you to host whatever data you like—up to 200 GB.
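
For example, you can browse these folders with h5pyd's Folder class. In the sketch below, "myusername" is a placeholder for your HDF Lab account name, and the exact contents of /shared/ may vary.

    import h5pyd

    # List the example files and sub-folders that every HDF Lab user can read.
    for name in h5pyd.Folder("/shared/"):
        print("/shared/" + name)

    # List your own folder (replace "myusername" with your HDF Lab account name).
    for name in h5pyd.Folder("/home/myusername/"):
        print("/home/myusername/" + name)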

FAQ

Visit the HDF Lab FAQs on our support site.

Getting Started

  1. If you have not previously registered on this site, please start by registering. If you have registered before, please log in.
  2. Click the Start HDF Lab button.
  3. Review and accept the license agreement.
  4. You can now start your free trial of HDF Lab.

Other Resources

Questions?

Contact the Help Desk or post a question on the HSDS forum.