HSDS Docker Images
by John Readey, The HDF Group
The Highly Scalable Data Service (HSDS) runs as a set of containers in Docker (or pods in Kubernetes). As with everything in Docker, each container instance is created from a container image, which follows the OCI image format specification (see https://github.com/opencontainers/image-spec). Unlike, say, a library binary, the container image includes all the dependent libraries the container needs to run. To create a container image, you can use the “docker build” command, which follows the instructions in a Dockerfile. Each image is identified by a repository name (e.g. “hdfgroup/hsds”) and a tag (e.g. “latest”), separated by a colon, for example hdfgroup/hsds:latest. Tags are typically used to distinguish different versions of the container software.
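To make the repository:tag naming concrete, here is a minimal Python sketch that splits an image reference into its two parts. The function name split_image_ref is made up for this illustration, and the sketch ignores digest-style references (the “@sha256:...” form); it is not how Docker itself parses names.

```python
from typing import Tuple


def split_image_ref(ref: str) -> Tuple[str, str]:
    """Split 'repository:tag' into (repository, tag), defaulting the tag to 'latest'."""
    # Only a colon that comes after the last '/' separates the tag; an
    # earlier colon would belong to a registry host:port prefix.
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > last_slash:
        return ref[:colon], ref[colon + 1:]
    # No tag given: fall back to "latest".
    return ref, "latest"


print(split_image_ref("hdfgroup/hsds:latest"))
print(split_image_ref("hdfgroup/hsds"))
```

Both calls return the same pair, since a reference without a tag is treated as if “latest” had been written out.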
Rather than building the container image for HSDS yourself, it’s even easier to just fetch an image from Docker Hub. Docker Hub is a popular registry for container images, and you can find different versions of the HSDS container image here. When a specific container image tag is needed, Docker first checks whether it has already been downloaded and, if it is not present, downloads it from the requested registry (Docker Hub if not otherwise specified).
Here’s how the process works when the runall.sh script is invoked to start HSDS:
runall checks for the existence of various environment variables to determine whether this is an AWS, Azure, or Posix deployment. For example, if AWS_S3_GATEWAY is found, it assumes this is an AWS deployment using S3 storage.
The command “docker-compose -f <deployment_file> up -d” is executed, where <deployment_file> is one of the deployment yamls provided here. The deployment file specifies which containers are to be launched and how they will be networked together. (This enables the SN container to find the DN containers, for example.)
Docker uses the image key in the deployment yaml to fetch the indicated container image (if not already present). If the image key doesn’t include a tag, the tag “latest” will be fetched.
Containers are started using the container image plus whatever settings are given in the compose file.
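The steps above can be sketched roughly in Python. This is only an illustration of the decision logic, not the contents of runall.sh: AWS_S3_GATEWAY comes from the description above, while AZURE_CONNECTION_STRING, the Posix fallback, and the three file names are assumptions made for the example. The sketch builds the docker-compose command but does not run it.

```python
import os
from typing import List, Mapping

# Hypothetical file names; substitute the deployment yaml files you actually use.
DEPLOYMENT_FILES = {
    "aws": "docker-compose.aws.yml",
    "azure": "docker-compose.azure.yml",
    "posix": "docker-compose.posix.yml",
}


def pick_deployment(env: Mapping[str, str]) -> str:
    """Guess the deployment type from the environment variables that are set."""
    # AWS_S3_GATEWAY is the variable named in the text above.
    if env.get("AWS_S3_GATEWAY"):
        return "aws"
    # AZURE_CONNECTION_STRING is an assumed name for the Azure check.
    if env.get("AZURE_CONNECTION_STRING"):
        return "azure"
    # Assumed fallback: anything else is treated as a Posix deployment.
    return "posix"


def compose_command(env: Mapping[str, str]) -> List[str]:
    """Build (but do not execute) the docker-compose command line."""
    deployment_file = DEPLOYMENT_FILES[pick_deployment(env)]
    return ["docker-compose", "-f", deployment_file, "up", "-d"]


if __name__ == "__main__":
    print(" ".join(compose_command(os.environ)))
```

Passing the environment in as a plain mapping keeps the logic easy to try out with a dictionary before pointing it at os.environ.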
The runall.sh script is just a convenience method to make the initial deployment of HSDS as easy as possible. Feel free to customize the deployment yaml in any way you see fit, and/or to just invoke docker-compose directly rather than use the runall script.
One change you may wish to consider is to use a specific tag for the image. While you may think the default “latest” tag would give you the most recent image in the repository, Docker actually treats “latest” as it would any other tag: it refers to whichever image was last pushed under that tag. In any case, for production use you likely don’t want to be surprised by a new version of HSDS suddenly showing up. Best practice is to set the tag to one of the specific tags in the repo and to update it periodically to pick up the latest stable release. The tags starting with “sha-” are based on the SHA value of the corresponding commit to the GitHub master branch. Once a commit to the GitHub master branch has passed all the continuous integration checks, the resulting image is automatically pushed to Docker Hub. If you use one of these tags in your compose file, you are guaranteed the image will not change in the future. Conversely, if you use the tag “master”, you’ll get the most recent master image (and you may or may not get a different set of bits when you next deploy HSDS).
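If you want a quick check that a compose file only uses pinned tags, something like the following Python sketch could help. It scans the text for image keys with a simple regular expression rather than a real YAML parser, and it treats a missing tag, “latest”, and “master” as floating. The service names and the tag “sha-0123abc” in the example are placeholders, not real HSDS tags.

```python
import re
from typing import List

# Tags that can point at different images over time.
FLOATING_TAGS = {"latest", "master"}

# Matches lines such as "image: hdfgroup/hsds:master", with optional quotes.
IMAGE_LINE = re.compile(r"^[ \t]*image:[ \t]*[\"']?([^\s\"'#]+)", re.MULTILINE)


def floating_images(compose_text: str) -> List[str]:
    """Return the image references whose tag is missing or floating."""
    found = []
    for ref in IMAGE_LINE.findall(compose_text):
        # Drop any registry or namespace prefix before looking for the tag.
        name = ref.rsplit("/", 1)[-1]
        tag = name.split(":", 1)[1] if ":" in name else "latest"
        if tag in FLOATING_TAGS:
            found.append(ref)
    return found


# Placeholder compose fragment for illustration only.
EXAMPLE = """
services:
  sn:
    image: hdfgroup/hsds:master
  dn:
    image: hdfgroup/hsds:sha-0123abc
  head:
    image: hdfgroup/hsds
"""

print(floating_images(EXAMPLE))
```

In this example the first and third images are reported, while the one with the “sha-” tag is left alone.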
By the way, there’s no requirement to use Docker Hub. If you are running on AWS, it’s equally convenient to use the Elastic Container Registry. Or if you are running on Azure, you can use the Azure Container Registry. For these options, you’ll need to push the HSDS container images yourself.
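Pushing an image to your own registry usually comes down to a pull, a retag, and a push. The Python sketch below just assembles those three docker commands; the helper name mirror_commands, the registry host “myregistry.azurecr.io”, and the tag “sha-0123abc” are placeholders for illustration. It assumes you have already logged in to the target registry, and that the target repository exists if your registry requires it to be created first.

```python
from typing import List


def mirror_commands(source: str, registry: str) -> List[List[str]]:
    """Return the docker commands that copy `source` into `registry`."""
    # Keep only the final path component and tag, e.g. "hsds:sha-0123abc".
    name_and_tag = source.rsplit("/", 1)[-1]
    target = f"{registry}/{name_and_tag}"
    return [
        ["docker", "pull", source],         # fetch the image from its current home
        ["docker", "tag", source, target],  # give it a name in the new registry
        ["docker", "push", target],         # upload it under that name
    ]


# Placeholder registry host and tag; nothing is executed here.
for cmd in mirror_commands("hdfgroup/hsds:sha-0123abc", "myregistry.azurecr.io"):
    print(" ".join(cmd))
```

Your compose file’s image key would then point at the new name instead of the Docker Hub one.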
Finally, if you are deploying on Kubernetes, you’ll use the same image tags and repositories as with Docker. You’ll specify the image tags in the deployment yaml that you’ll create based on one of the examples provided here.
I hope this helps answer any questions you may have about container images. Let us know if this is useful or raises additional questions!