
This web site is no longer maintained (but will remain online).
Please see The HDF Group's new Support Portal for the latest information.

PSH5X - A Windows PowerShell Module for HDF5


PSH5X is a Windows PowerShell module for HDF5. It leverages PowerShell's provider model to produce a file-system-like experience for HDF5 (an often-cited metaphor). PSH5X helps you perform simple housekeeping tasks such as renaming HDF5 links or copying HDF5 objects, but it can also create new HDF5 items (HDF5 objects, links, attributes) and read or write HDF5 dataset and attribute values. Have you ever asked questions similar to the following?

You'll find that these are examples of the proverbial 'one-liners' in PSH5X.

After years of uncontrolled growth of a bewildering jungle of scripting technologies on the Windows platform, there is finally a one-stop automation hub: Windows PowerShell. You may not be aware of it, but it comes with every modern Windows desktop or server installation. You can view PSH5X as a ramp leading straight into the fast lane of the PowerShell highway, where you have access to a myriad of helpful cmdlets that get almost every HDF5 job done. For example, have a look at FAQ 2.01 if you have ever wondered how to get data from HDF5 into Excel.

There are already several excellent scripting interfaces available for HDF5 including Andrew Collette's h5py Python module. Most people would probably agree that for something as wonderful and multi-faceted as HDF5 there can hardly be too many good choices. With PSH5X we're adding another powerful tool to the arsenal and hope that, with your help, it will find its "niche" in the ecosystem.

Questions? Watch the PSH5X movie and check out a few remarks on terminology, a list of PSH5X cmdlets, the FAQ, the tutorial, advanced features, and several limitations and known issues.

Please send your questions/comments/suggestions to the HDF forum or contact us by email.

Prerequisites

Operating system: Windows XP, Vista, 7, 8; Windows Server 2003, 2008, 2008 R2, 2012
Runtime: Visual C++ 2010 Redistributable Package x86 or x86_64
.NET Framework: Version 4 or higher; earlier versions are currently not supported.
PowerShell: Version 2.0 or higher; Version 1.0 is not supported.
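
The prerequisites above can be checked from an existing PowerShell session. The following is a minimal sketch that relies only on built-in variables; the variable names ($psVersion, $clrVersion, $bitness) are chosen here for illustration, and the CLRVersion entry is assumed to be present, which is the case in Windows PowerShell 2.0 through 5.1.

```powershell
# PowerShell engine version; PSH5X needs 2.0 or higher.
$psVersion = $PSVersionTable.PSVersion

# .NET runtime (CLR) version used by this session; PSH5X needs 4 or higher.
$clrVersion = $PSVersionTable.CLRVersion

# A pointer is 8 bytes wide in a 64-bit process and 4 bytes in a 32-bit one.
# This tells you which build of PSH5X.dll (Win32 or Win64) matches the session.
$bitness = if ([IntPtr]::Size -eq 8) { 'Win64' } else { 'Win32' }

"PowerShell $psVersion, CLR $clrVersion, $bitness session"
```

If the CLR version reported under PowerShell 2.0 starts with 2, the session is running on .NET 2.0 and step 2 of the installation applies.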

Downloads

Installation

  1. Unpack the zip archive. Create a folder called HDF5 in one of the directories listed in your PowerShell module path, $env:PSModulePath. For example, on Windows 7 you'd most likely create it in C:\Users\{your username}\Documents\WindowsPowerShell\Modules. Copy the contents of the module folder and the PSH5X.dll for your architecture (Win32 or Win64) from the zip archive into the HDF5 folder you just created. Alternatively, run the Install-PSH5X.ps1 script, which performs these steps for you. PSH5X.dll depends on the Visual C++ 2010 runtime via msvcp100.dll and msvcr100.dll, which are included in the zip archive. Unless they are already in your PATH, please copy them (for your architecture) to a location in your PATH.
  2. If you are running PowerShell version 2.0, make sure that it runs with .NET 4.0 (and not the default .NET 2.0). The easiest way to accomplish this is to copy the files powershell.exe.config and (optionally) powershell_ise.exe.config from the zip archive to $PSHOME, typically C:\Windows\System32\WindowsPowerShell\v1.0. A broader discussion can be found in this blog entry and this forum post.
  3. At this point, you may try running Import-Module HDF5. There's a chance that instead of a cheerful "Welcome to HDF5!" greeting you'll see an error message telling you that ... cannot be loaded because the execution of scripts is disabled on this system. By default, PowerShell can't run any scripts. Whether you can execute scripts, and what kind, is governed by a so-called execution policy. Your system administrator may have set it, or it may still have the default value of Restricted, which is a euphemism for 'no scripts'. The Set-ExecutionPolicy cmdlet lets you adjust what kind of scripts are allowed to run. Running Set-ExecutionPolicy without a -Scope argument changes the registry and requires administrative privileges. Your options are:
    • You have administrative rights on the machine: Run PowerShell as administrator and adjust the execution policy as follows: Set-ExecutionPolicy RemoteSigned
    • You don't have administrative rights on the machine: Adjust the execution policy for just this session as follows: Set-ExecutionPolicy -Scope Process RemoteSigned
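
Before importing the module, it can help to confirm that the files landed where PowerShell looks for them and to see which execution policy is in effect. The sketch below uses only built-in cmdlets and changes nothing on the system; the variable names ($moduleDirs, $found) are illustrative, and the ';' separator assumes Windows.

```powershell
# Directories PowerShell searches for modules. The HDF5 folder from step 1
# must sit directly inside one of them.
$moduleDirs = $env:PSModulePath -split ';'
$moduleDirs

# Ask PowerShell whether it can discover a module named HDF5.
# Nothing is returned if the folder is missing or misplaced.
$found = Get-Module -ListAvailable HDF5
if ($found) { 'HDF5 module found.' } else { 'HDF5 module not found in the module path.' }

# Show the execution policy at every scope. A value of Restricted with
# nothing more permissive at a narrower scope means scripts cannot run.
Get-ExecutionPolicy -List
```

If the module is found and the policy permits scripts (or after applying one of the Set-ExecutionPolicy options from step 3), Import-Module HDF5 should succeed.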

If you did everything right, when importing the HDF5 module, you should see something like this:

[Screenshot: PowerShell window]

Each session has its own sandbox HDF5 drive named h5tmp. The full path to the HDF5 file backing this drive is stored in the PSH5XTmpFile environment variable. (The file does not have an .h5 extension, but it is an HDF5 file. Trust me!) This file remains open and writeable as long as the drive is mapped. You can close it either via Remove-H5Drive h5tmp or by closing the session (window). PSH5X does not delete the sandbox file! (in case you'd like to recover some of the objects later...)
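
The sandbox behavior described above can be observed directly. This is a minimal sketch that uses only the PSH5XTmpFile variable and the Remove-H5Drive cmdlet mentioned in the text; the $sandboxFile variable name is illustrative, and the sketch assumes it is run from a location outside the h5tmp: drive.

```powershell
# Full path of the HDF5 file backing the h5tmp drive. PSH5X sets this
# environment variable when the module is imported.
$sandboxFile = $env:PSH5XTmpFile

if ($sandboxFile) {
    "Sandbox file: $sandboxFile"

    # Unmap the drive, which closes the backing file.
    Remove-H5Drive h5tmp

    # PSH5X does not delete the sandbox file, so it should still be on disk.
    Test-Path $sandboxFile
} else {
    'PSH5XTmpFile is not set; import the HDF5 module first.'
}
```

Because the file is left behind, you may want to delete it yourself once you no longer need its contents.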

PowerShell Resources

Windows PowerShell: Learn It Now Before It's an Emergency
Bruce Payette: Windows PowerShell in Action, 2nd Ed. Manning 2011.
Must read!
Arul Kumaravel, Jon White, Michael Naixin Li, Scott Happell, Guohui Xie, Krishna C. Vutukuri: Professional Windows PowerShell Programming: Snapins, Cmdlets, Hosts and Providers. Wrox 2008.
Must read, if you want to make sense of the source code.
Windows PowerShell: Origin and Future
Bruce Payette and Jeffrey Snover discuss PowerShell
Windows PowerShell Blog
Automating the world one-liner at a time...
PowerShell Community Extensions
PowerShell Community Extensions (PSCX) is aimed at providing a widely useful set of additional cmdlets, providers, aliases, filters, functions and scripts for Windows PowerShell that members of the community have expressed interest in.

An Appetizer

PSH5X manages HDF5 files in the form of drives that one can assign drive names such as h5 or aura. Such drives can be dynamically (un-)mounted in a PowerShell session. Below are a few sample commands that might give you an idea of what it feels like to work in PowerShell/PSH5X. (Sample file download)

# load the PSH5X module into a PowerShell session (not necessary in PS v.3)
Import-Module HDF5

# create a drive called 'aura' backed by an HDF5 file 
New-H5Drive aura C:\h5\HIRDLS-Aura_L2_v02-04-09-c03_2008d001.he5

# make the HDF5 root of 'aura' our "working directory"
cd aura:

# How many HDF5 groups and datasets are there in this file?
dir . -Recurse -Filter dg | %{ switch($_.ItemType) `
    { 'Group' {$groups++} 'Dataset' {$datasets++} } };  `
    "Groups: $groups - Datasets: $datasets"
    
# list all HDF5 path names to datasets that contain 'H2O'
dir . -r -Filter d | ?{$_.PSChildName -like '*H2O*'} | select PSPath    
    
# How much data is stored in HDF5 datasets vs. the total file size?
dir . -Recurse -Filter d | %{ $dsetBytes += $_.StorageSizeBytes }; `
    "In datasets: $dsetBytes - File: $((Get-H5Drive aura).FileSizeBytes)"    
 
# switch to the HDF5 sandbox drive 'h5tmp'
cd h5tmp:

# create a new HDF5 group 'copy of aura' and populate it with HDF5 objects from 'aura:'
mkdir 'copy of aura'; Copy-H5Item aura:\* 'copy of aura' -Recurse

# remove the 'aura' drive
Remove-H5Drive aura

# create a chunked, compressed, extendible dataset and force intermediate group creation
New-H5Dataset '/A/B/My Doubles' double 10,20,40 -MaxDim -1,50,-1 -Chunk 10,10,10 -Gzip 6 -Force

# create a .NET array and set the value of '/A/B/My Doubles' (value initialization not shown)
$value = New-H5Object 'double[,,]' 10,20,40
...
Set-H5DatasetValue '/A/B/My Doubles' $value
        


Last modified: 31 January 2017