Using DAP Clients to Visualize HDF-EOS5 Grid Data

Introduction

We have tested the HDF5 handler with several OPeNDAP clients that can be used to visualize Earth Science data. Unfortunately, most of them failed to display any images. After substantial investigation, we found the main reason: these clients follow the CF conventions, while DAP itself does not require them. When we first implemented the HDF5 handler, we only preserved the geo-location information by mapping EOS Grid to DAP Grid; we did not try to follow the CF conventions.

After finding the reason, we added a configuration option to the handler that partially supports the CF conventions. We hope this will provide a way to make several clients work. The CF option sacrifices some features provided by the default HDF5 handler, but it is convenient when one wants to visualize data with existing OPeNDAP clients. We have tested the CF configuration option on NASA Aura HDF-EOS5 L3G data with the clients discussed below: Ferret, ncBrowse, IDV, ODC, and GrADS.

One goal of this experiment is to provide information that helps existing OPeNDAP clients better serve data that strictly follow the DAP protocol. In particular, we hope that existing OPeNDAP clients can handle DAP Grid data without additional requirements such as the CF conventions.

CF Convention Problems

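In general, OPeNDAP places no restriction on the length of variable names or on the format of attributes and the Grid data structure. However, some OPeNDAP visualization clients strictly follow the CF conventions and fail to display data that do not; the specific requirements we ran into are described client by client below.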

Proposed Limited Solution

We provide an --enable-cf configuration option. When you build hdf5_handler, enable both --enable-cf and --enable-short-path. This changes the behavior of hdf5_handler in the following ways:
  1. It does not generate HDF5 group attributes. (e.g. the group /HDFEOS/ in Figure 1a doesn't appear in Figure 1b.)
  2. The DAP variable name is the path of the object relative to its immediate parent group. (e.g. the Grid variable /HDFEOS/GRIDS/ColumnAmountO3/Data%20Fields/CloudFraction in Figure 2a becomes CloudFraction in Figure 2b.)
  3. It does not display a variable whose name is longer than 128 characters in the DAS and DDS.
  4. It generates independent shared dimension variables outside of the Grid. (e.g. Float32 lat and Float32 lon in Figure 2b, which do not appear in Figure 2a.)
  5. Some attributes and variables are renamed (e.g. XDim into lon) and some are inserted (i.e. Conventions "COARDS, GrADS").
  6. Minimum, maximum and resolution attribute values for lat/lon are obtained by parsing the StructMetadata attribute.
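
Items 2, 3 and 5 above can be illustrated with a short Python sketch. This is not the handler's code (hdf5_handler is not written in Python); the function name cf_name, the YDim-to-lat rename, and the choice to apply the 128-character check to the shortened name are assumptions made only for illustration.

```python
MAX_NAME_LENGTH = 128  # limit from item 3

# XDim -> lon comes from item 5; YDim -> lat is assumed by symmetry.
RENAMES = {"XDim": "lon", "YDim": "lat"}


def cf_name(full_path):
    """Return the short DAP name for an HDF5 object path, or None if it is dropped."""
    # Item 2: keep only the name relative to the immediate parent group.
    short = full_path.rstrip("/").rsplit("/", 1)[-1]
    # Item 5: rename selected variables.
    short = RENAMES.get(short, short)
    # Item 3: names longer than the limit are not shown (assumed to apply
    # to the shortened name).
    if len(short) > MAX_NAME_LENGTH:
        return None
    return short


if __name__ == "__main__":
    print(cf_name("/HDFEOS/GRIDS/ColumnAmountO3/Data%20Fields/CloudFraction"))
    print(cf_name("/HDFEOS/GRIDS/ColumnAmountO3/XDim"))
```

Note that the group part of the path is discarded entirely, which is what makes the names short enough for most clients and also what causes the ambiguity discussed under Limitations and Potential Problems.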

Figure 1a: Default HDF5 Handler DAS Output

Figure 1b: CF option enabled HDF5 Handler DAS Output

Figure 2a: Default HDF5 Handler DDS Output

Figure 2b: CF option enabled HDF5 Handler DDS Output

Detailed Behaviors of Each Client after Applying Our Solution

The following is our current experience with the OPeNDAP clients.

Ferret

Ferret can display data with a small modification of the existing handler. It requires the Grid map data to fall within the valid range of geo-spatial coordinates. For example, the values of the first map should range from -180 to 180 (longitude) and the values of the second map should range from -90 to 90 (latitude). It does not seem to require units in DAS attributes or shared dimensions in some cases, although providing them lets Ferret display more datasets and add correct "latitude" and "longitude" labels.
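
The range requirement is easy to check before serving data. The following Python sketch is only an illustration; the function name map_ranges_ok and the sample values are hypothetical.

```python
def map_ranges_ok(lon_values, lat_values):
    """Return True if all map values fall in the ranges Ferret expects."""
    lon_ok = all(-180.0 <= v <= 180.0 for v in lon_values)
    lat_ok = all(-90.0 <= v <= 90.0 for v in lat_values)
    return lon_ok and lat_ok


if __name__ == "__main__":
    # Map data given as array indices (0..1439) would fail the longitude check.
    print(map_ranges_ok(range(1440), range(720)))
    print(map_ranges_ok([-179.875, 0.125, 179.875], [-89.875, 89.875]))
```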

ncBrowse

ncBrowse appears to be more flexible than Ferret. As long as there are some shared dimension variables in the DDS, it can display Grids. It places no restriction on variable name length, and the unit specification in the DAS doesn't matter. It is also convenient that those variables can be mapped arbitrarily to the X and Y coordinates. For example, you can swap the x and y coordinates to get a rotated image.

IDV

IDV is very sensitive to the attributes (i.e. DAS information) returned by OPeNDAP. It seems to parse the DAS information first and then retrieve the DDS information from the OPeNDAP server. If a variable doesn't show up in the DAS, the IDV GUI client doesn't display the variable even if the DDS has it.

IDV doesn't accept a Byte attribute in the DAS that has a negative value; it throws an error instead. For example, if an attribute looks like Byte _FillValue -1, IDV stops there.
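
One way around this, listed in the summary table further down, is to convert the signed value into an unsigned one. A minimal Python sketch of that conversion follows; the function name to_unsigned_byte is hypothetical, and the 8-bit two's-complement reinterpretation is our assumption about what the conversion should do.

```python
def to_unsigned_byte(value):
    """Reinterpret a signed 8-bit value as an unsigned one (0..255)."""
    if not -128 <= value <= 255:
        raise ValueError("value does not fit in 8 bits: %r" % (value,))
    # Masking with 0xFF keeps the low 8 bits, so -1 becomes 255.
    return value & 0xFF


if __name__ == "__main__":
    print(to_unsigned_byte(-1))
```

A client that compares data values against _FillValue would then need to read the data as unsigned bytes as well, otherwise the fill value no longer matches.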

IDV requires shared dimension variables defined outside the Grid, with precise unit/value pair attributes associated with them. For example, units "degrees_north", "degrees_east" and "level" must be provided (with the quotes included) to display data properly.

ODC

Although ODC could retrieve data successfully, it could not display data graphically until the array dimensions were properly named lat and lon.

GrADS

Like ODC, GrADS also requires named dimensions in the Array within a Grid datatype.

GrADS has a special problem with the length of variable names when OPeNDAP Grids are read with the sdfopen command. If a variable name is longer than 15 characters, GrADS drops everything after the 15th character. For example, if an HDF-EOS5 file has both EffectiveTemperature and EffectiveTemperaturePrecision variables, both will be referred to as effectivetemper in GrADS. The following GrADS output illustrates the 15-character limit problem:

    ga-> sdfopen http://hdfdap.hdfgroup.uiuc.edu:8888/opendap/data/AURA_Level_3_Data/Another-OMI-L3-file.he5
    Scanning self-describing file: http://hdfdap.hdfgroup.uiuc.edu:8888/opendap/data/AURA_Level_3_Data/Another-OMI-L3-file.he5
    ga-> query file
    File 1 : NASA EOS Aura Grid
      Descriptor: http://hdfdap.hdfgroup.uiuc.edu:8888/opendap/data/AURA_Level_3_Data/Another-OMI-L3-file.he5
      Binary: http://hdfdap.hdfgroup.uiuc.edu:8888/opendap/data/AURA_Level_3_Data/Another-OMI-L3-file.he5
      Type = Gridded
      Xsize = 1440  Ysize = 720  Zsize = 1  Tsize = 1
      Number of Variables = 12
        cloudfraction 0 -999
        cloudfractionpr 0 -999
        cloudpressure 0 -999
        cloudpressurepr 0 -999
        columnamounto3 0 -999
        columnamounto3p 0 -999
        effectivetemper 0 -999
        effectivetemper 0 -999
        ghostcolumnamou 0 -999
        solarzenithangl 0 -999
        terrainheight 0 -999
        terrainreflecti 0 -999
    ga->
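
A data provider can predict which variables will collide in GrADS before serving a file. The Python sketch below mimics the behavior visible in the listing (lower-casing and cutting at 15 characters); it is not GrADS code, and the names grads_name and find_collisions are hypothetical.

```python
from collections import defaultdict

GRADS_NAME_LIMIT = 15  # limit observed with sdfopen


def grads_name(name):
    """Approximate the name GrADS shows: lower case, first 15 characters."""
    return name.lower()[:GRADS_NAME_LIMIT]


def find_collisions(names):
    """Group the original names by their GrADS name; keep groups of two or more."""
    groups = defaultdict(list)
    for name in names:
        groups[grads_name(name)].append(name)
    return {short: full for short, full in groups.items() if len(full) > 1}


if __name__ == "__main__":
    variables = [
        "EffectiveTemperature",
        "EffectiveTemperaturePrecision",
        "TerrainHeight",
    ]
    print(find_collisions(variables))
```

For the three names above, only the two EffectiveTemperature variables are reported, both under effectivetemper, which matches the duplicated entry in the listing.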

Here is a summary of the OPeNDAP visualization clients:

| Client | Problem | Solution | Notes |
|---|---|---|---|
| ncBrowse | N/A | N/A | 2D display only. Axes are array indices. |
| Ferret | Grid map data should fit into a certain range (-180~180 / -90~90). | Generate correct Grid map data. | Attributes with units should be provided for correct display. |
| IDV | No negative value in Byte attribute. | Convert signed value into unsigned value. | Beautiful user interface. |
| GrADS | 15-character limit in variable names. | Rename grid map data variables and named dimensions in Array. | |
| ODC | N/A | N/A | Display requires the GrADS solution. |

Limitations and Potential Problems

Shortening variable and dimension names is not desirable in the HDF5 handler, since losing the group information can cause ambiguity. For example, if there are two or more "XDim" variables (e.g. /GRID1/XDim and /GRID2/XDim) that have different sizes (e.g. size(/GRID1/XDim)=10 and size(/GRID2/XDim)=20), clients that depend on the shared dimension will not work because the CF-option-enabled HDF5 handler will treat them as the same variable.
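
Whether a given file is affected can be checked by grouping dimension variables by their short name and comparing sizes. The Python sketch below is an illustration only; the function name conflicting_dimensions and the input dictionary are hypothetical, and the sizes are the ones from the example above.

```python
from collections import defaultdict


def conflicting_dimensions(dim_sizes):
    """Map each short name to its full paths when the paths disagree on size.

    dim_sizes maps a full HDF5 path to the size of that dimension variable.
    """
    by_short = defaultdict(dict)
    for path, size in dim_sizes.items():
        short = path.rsplit("/", 1)[-1]  # drop the group information
        by_short[short][path] = size
    return {
        short: paths
        for short, paths in by_short.items()
        if len(set(paths.values())) > 1
    }


if __name__ == "__main__":
    sizes = {
        "/GRID1/XDim": 10,
        "/GRID2/XDim": 20,
        "/GRID1/YDim": 5,
        "/GRID2/YDim": 5,
    }
    print(conflicting_dimensions(sizes))
```

In this example only XDim is reported; YDim also loses its group information, but both grids agree on its size.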

Providing "units" information requires a two-pass algorithm in the handler. The data provider has to know which variables should have units attached. If a variable already has a "units" attribute, that could be a potential problem too. Currently, units attributes are added manually.

Ferret is flexible in terms of variable names and their length, but it requires generating Grid map data that fit into certain ranges (i.e. -180~180 / -90~90). This can compromise the integrity of the data if the client is not informed. This has been acceptable for EOS data so far, since the map data are generated artificially by parsing the "StructMetadata.0" information.
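
As a rough illustration of how lat/lon minimum, maximum and resolution values can be derived from that metadata, consider the Python sketch below. The SAMPLE text is made up in the style of a StructMetadata.0 grid entry, and reading the corner points as packed degrees-minutes-seconds values (DDDMMMSSS.SS) is an assumption about the HDF-EOS geographic projection rather than something stated in this document. The function names are hypothetical and this is not the handler's parser.

```python
import re

# Hypothetical excerpt in the style of StructMetadata.0 (assumed layout).
SAMPLE = """\
GROUP=GridStructure
    GROUP=GRID_1
        GridName="ColumnAmountO3"
        XDim=1440
        YDim=720
        UpperLeftPointMtrs=(-180000000.000000,90000000.000000)
        LowerRightMtrs=(180000000.000000,-90000000.000000)
        Projection=GCTP_GEO
    END_GROUP=GRID_1
END_GROUP=GridStructure
"""


def dms_to_degrees(packed):
    """Convert a packed DDDMMMSSS.SS value to decimal degrees (assumed encoding)."""
    sign = -1.0 if packed < 0 else 1.0
    packed = abs(packed)
    degrees = int(packed // 1_000_000)
    minutes = int((packed % 1_000_000) // 1_000)
    seconds = packed % 1_000
    return sign * (degrees + minutes / 60.0 + seconds / 3600.0)


def grid_extent(text):
    """Return min, max and resolution for lon and lat of the first grid found."""
    xdim = int(re.search(r"XDim=(\d+)", text).group(1))
    ydim = int(re.search(r"YDim=(\d+)", text).group(1))
    pair = r"\(\s*([-+\d.eE]+)\s*,\s*([-+\d.eE]+)\s*\)"
    upper_left = re.search(r"UpperLeftPointMtrs=" + pair, text)
    lower_right = re.search(r"LowerRightMtrs=" + pair, text)
    west, north = (dms_to_degrees(float(v)) for v in upper_left.groups())
    east, south = (dms_to_degrees(float(v)) for v in lower_right.groups())
    return {
        "lon": {"min": west, "max": east, "resolution": (east - west) / xdim},
        "lat": {"min": south, "max": north, "resolution": (north - south) / ydim},
    }


if __name__ == "__main__":
    print(grid_extent(SAMPLE))
```

Because the values come from the grid description and not from stored coordinate arrays, a client receives generated map data, which is the integrity concern raised above.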

Conclusion

Although many clients claim to support OPeNDAP, they do not support the full capability that OPeNDAP can provide. We suspect that they mainly rely on the NetCDF client library provided by OPeNDAP, which is targeted at supporting the NetCDF format over the Internet. To fully support all capabilities of HDF5, such as group structure or long names, it is important to develop an HDF5-friendly OPeNDAP client library.