Using DAP Clients to Visualize HDF-EOS5 Grid Data
Introduction
We have tested the HDF5 handler with several OPeNDAP clients that can be used to visualize Earth Science data. Unfortunately, most of them failed to display any images. After in-depth investigation, we found that the main reason is that these clients follow the CF conventions, whereas DAP data does not necessarily follow them. When we first implemented the HDF5 handler, we preserved geo-location information only by mapping EOS Grid to DAP Grid; we did not try to follow the CF conventions. Having identified the cause, we added a configuration option to the handler that partially supports the CF conventions, in the hope that it makes several clients work. This CF option sacrifices some features provided by the default HDF5 handler, but it is convenient when one wants to visualize data with existing OPeNDAP clients. We have tested our CF configuration option on NASA Aura HDF-EOS5 L3G data with the following clients: Ferret, ncBrowse, IDV, ODC, and GrADS.
One goal of this experiment is to give developers of existing OPeNDAP clients the information they need to better serve DAP data that strictly follows the DAP protocol. In particular, we hope that existing OPeNDAP clients can handle DAP Grid data without additional requirements such as the CF conventions.
CF Convention Problems
In general, OPeNDAP places no restriction on the length of variable names or on the format of attributes and Grid data structures.
However, some OPeNDAP visualization clients strictly follow CF conventions such as:
- Shared dimensions must be defined and available outside the Grid data type.
- Shared dimensions must have pre-defined units in their attributes.
- The Array within a Grid type should have a name for each dimension.
Proposed Limited Solution
We provide an --enable-cf configuration option. When you build hdf5_handler, enable both --enable-cf and --enable-short-path.
This will affect the behavior of hdf5_handler in the following ways:
- It does not generate HDF5 group attributes.
(e.g. the group /HDFEOS/ in Figure 1a doesn't appear in Figure 1b.)
- The DAP variable name is the path of the object relative to its immediate parent group.
(e.g. The Grid variable /HDFEOS/GRIDS/ColumnAmountO3/Data%20Fields/CloudFraction in Figure 2a becomes CloudFraction as shown in Figure 2b.)
- It does not display in the DAS and DDS any variable whose name is longer than 128 characters.
- It generates independent shared dimension variables outside of the Grid. (e.g. Float32 lat and Float32 lon in Figure 2b, which do not appear in Figure 2a.)
- Some attributes and variables are renamed (e.g. XDim into lon) and some are inserted (e.g. Conventions "COARDS, GrADS").
- Minimum, maximum and resolution attribute values for lat/lon are obtained by parsing the StructMetadata attribute.
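The last item above can be illustrated with a minimal Python sketch. It is not the handler's actual code. The sample text is an illustrative excerpt rather than output from a real file, the function names are hypothetical, and the sketch assumes that corner points are stored as packed DDDMMMSSS.SS values and that map values refer to cell centers.

```python
import re

# Illustrative excerpt only (an assumption); real StructMetadata text has many more fields.
SAMPLE_STRUCT_METADATA = """
GROUP=GridStructure
    GROUP=GRID_1
        GridName="ColumnAmountO3"
        XDim=1440
        YDim=720
        UpperLeftPointMtrs=(-180000000.000000,90000000.000000)
        LowerRightMtrs=(180000000.000000,-90000000.000000)
    END_GROUP=GRID_1
END_GROUP=GridStructure
"""


def packed_dms_to_degrees(value):
    """Convert a packed DDDMMMSSS.SS value to decimal degrees (assumed encoding)."""
    sign = -1.0 if value < 0 else 1.0
    value = abs(value)
    degrees = int(value // 1_000_000)
    minutes = int((value % 1_000_000) // 1_000)
    seconds = value % 1_000
    return sign * (degrees + minutes / 60.0 + seconds / 3600.0)


def parse_grid_extent(text):
    """Return (xdim, ydim, west, east, south, north) from StructMetadata-like text."""
    xdim = int(re.search(r"XDim=(\d+)", text).group(1))
    ydim = int(re.search(r"YDim=(\d+)", text).group(1))
    ul = re.search(r"UpperLeftPointMtrs=\(([^,]+),([^)]+)\)", text)
    lr = re.search(r"LowerRightMtrs=\(([^,]+),([^)]+)\)", text)
    west, north = (packed_dms_to_degrees(float(v)) for v in ul.groups())
    east, south = (packed_dms_to_degrees(float(v)) for v in lr.groups())
    return xdim, ydim, west, east, south, north


def cell_centers(start, stop, count):
    """Evenly spaced cell-center coordinates between two edges (assumed registration)."""
    step = (stop - start) / count
    return [start + step * (i + 0.5) for i in range(count)]


xdim, ydim, west, east, south, north = parse_grid_extent(SAMPLE_STRUCT_METADATA)
lon = cell_centers(west, east, xdim)    # values for the shared "lon" variable
lat = cell_centers(south, north, ydim)  # values for the shared "lat" variable
lon_resolution = (east - west) / xdim   # candidate "resolution" attribute value
```

With values generated this way, the minimum, maximum and resolution attributes follow directly from the parsed extent and dimension sizes.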
Figure 1a: Default HDF5 Handler DAS Output
Figure 1b: CF option enabled HDF5 Handler DAS Output
Figure 2a: Default HDF5 Handler DDS Output
Figure 2b: CF option enabled HDF5 Handler DDS Output
Detailed Behaviors of Each Client after Applying Our Solution
The following is our current experience with the OPeNDAP clients.
Ferret
Ferret can display data with a small modification to the existing handler.
It requires the Grid map data to fall within the valid range of geo-spatial coordinates.
For example, the values of the first map data should range from -180 to 180 (longitude) and the second map data should range from -90 to 90 (latitude).
In some cases it does not seem to require units in DAS attributes or shared dimensions, although providing them lets it display more datasets and adds correct "latitude" and "longitude" labels.
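The range requirement described above can be checked before data is served. The following Python sketch is only an illustration. The function name is hypothetical, and it assumes that the first map holds longitudes and the second holds latitudes, as in the example.

```python
def map_values_in_range(lon_values, lat_values):
    """Return True if every longitude is in [-180, 180] and every latitude in [-90, 90]."""
    lon_ok = all(-180.0 <= v <= 180.0 for v in lon_values)
    lat_ok = all(-90.0 <= v <= 90.0 for v in lat_values)
    return lon_ok and lat_ok


# Map data that merely holds array indices (0..1439) falls outside the valid range.
index_based_lon = list(range(1440))
index_based_lat = list(range(720))
index_maps_valid = map_values_in_range(index_based_lon, index_based_lat)

# Map data expressed in degrees passes the check.
degree_maps_valid = map_values_in_range([-179.875, 0.125, 179.875], [-89.875, 89.875])
```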
ncBrowse
ncBrowse appears more flexible than Ferret in its behavior.
As long as there are some shared dimension variables in the DDS, it can display Grids.
It places no restriction on variable name length, and the unit specification in the DAS does not matter.
Those variables can also be mapped arbitrarily to the X and Y coordinates.
For example, you can swap x and y coordinates to get a rotated image.
IDV
IDV is very sensitive to the attributes (i.e. DAS information) returned by OPeNDAP.
It seems to parse DAS information first and then retrieve DDS information from an OPeNDAP server.
If a variable doesn't show up in the DAS, the IDV GUI client doesn't display the variable even if the DDS has it.
IDV does not accept a Byte attribute in the DAS that has a negative value; it throws an error.
For example, if an attribute has something like Byte _FillValue -1, IDV stops there.
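One way to avoid the negative Byte value is to reinterpret the signed 8-bit value as unsigned. The Python sketch below illustrates the arithmetic only. The function name is hypothetical and this is not the handler's code.

```python
def byte_attr_to_unsigned(value):
    """Reinterpret a signed 8-bit attribute value as an unsigned byte (0..255)."""
    if not -128 <= value <= 255:
        raise ValueError(f"value {value} does not fit in one byte")
    # Masking with 0xFF keeps the low 8 bits, so -1 becomes 255.
    return value & 0xFF


fill_value = byte_attr_to_unsigned(-1)
```

Note that a client reading the converted attribute must compare it against data interpreted as unsigned bytes as well, otherwise the fill value no longer matches.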
It requires shared dimension variables defined outside the Grid and precise unit/value pair attributes associated with them.
For example, the units "degrees_north", "degrees_east" and "level" must be provided (with the quotes included) to display data properly.
ODC
Although ODC could retrieve data successfully, it could not display the data graphically until the array dimensions were properly named lat and long.
GrADS
Like ODC, GrADS requires named dimensions in the Array within a Grid datatype.
GrADS also has a particular problem with the length of variable names when OPeNDAP Grids are read with the sdfopen command.
If a variable name is longer than 15 characters, GrADS drops everything after the 15th character.
For example, if an HDF-EOS5 file has both EffectiveTemperature and EffectiveTemperaturePrecision variables, both will be referred to as effectivetemper in GrADS. The following GrADS output illustrates the 15-character limit problem:
ga-> sdfopen http://hdfdap.hdfgroup.uiuc.edu:8888/opendap/data/AURA_Level_3_Data/Another-OMI-L3-file.he5
Scanning self-describing file: http://hdfdap.hdfgroup.uiuc.edu:8888/opendap/data/AURA_Level_3_Data/Another-OMI-L3-file.he5
ga-> query file
File 1 : NASA EOS Aura Grid
Descriptor: http://hdfdap.hdfgroup.uiuc.edu:8888/opendap/data/AURA_Level_3_Data/Another-OMI-L3-file.he5
Binary: http://hdfdap.hdfgroup.uiuc.edu:8888/opendap/data/AURA_Level_3_Data/Another-OMI-L3-file.he5
Type = Gridded
Xsize = 1440 Ysize = 720 Zsize = 1 Tsize = 1
Number of Variables = 12
cloudfraction 0 -999
cloudfractionpr 0 -999
cloudpressure 0 -999
cloudpressurepr 0 -999
columnamounto3 0 -999
columnamounto3p 0 -999
effectivetemper 0 -999
effectivetemper 0 -999
ghostcolumnamou 0 -999
solarzenithangl 0 -999
terrainheight 0 -999
terrainreflecti 0 -999
ga->
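Name collisions like the duplicated effectivetemper entry above can be detected before a file is served. The Python sketch below assumes that GrADS lowercases names and keeps the first 15 characters, which is what the transcript suggests. The function names are hypothetical.

```python
from collections import defaultdict

GRADS_NAME_LIMIT = 15  # limit observed in the transcript above


def grads_name(name, limit=GRADS_NAME_LIMIT):
    """Approximate the name GrADS shows: lowercased and cut to `limit` characters."""
    return name.lower()[:limit]


def find_collisions(names):
    """Group variable names by truncated form and keep groups with more than one member."""
    groups = defaultdict(list)
    for name in names:
        groups[grads_name(name)].append(name)
    return {short: full for short, full in groups.items() if len(full) > 1}


variables = ["CloudFraction", "EffectiveTemperature", "EffectiveTemperaturePrecision"]
collisions = find_collisions(variables)
```

Here collisions maps effectivetemper to both EffectiveTemperature and EffectiveTemperaturePrecision, while CloudFraction is unaffected.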
The following table summarizes our findings for the OPeNDAP visualization clients:
| Client | Problem | Solution | Notes |
| ncBrowse | N/A | N/A | 2D display only. Axes are array index. |
| Ferret | Grid map data must fit within a certain range (-180~180 / -90~90). | Generate correct Grid map data. | Attributes with units should be provided for correct display. |
| IDV | Negative values are not accepted in Byte attributes. | Convert signed values into unsigned values. | Beautiful user interface. |
| GrADS | 15 character limit in variable names | Renaming grid map data variables and named dimensions in Array. | |
| ODC | N/A | N/A | Display requires GrADS solution. |
Limitations and Potential Problems
Shortening variable and dimension names is not desirable in the HDF5 handler because losing the group information can cause ambiguity. For example, if there are two or more "XDim" variables (e.g. /GRID1/XDim and /GRID2/XDim) that have different sizes (e.g. size(/GRID1/XDim)=10 and size(/GRID2/XDim)=20), clients that depend on the shared dimension will not work because the CF-option-enabled HDF5 handler will treat them as the same variable.
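The ambiguity described above can be detected by comparing the sizes of dimensions that share a short name. The Python sketch below is a hypothetical check, not part of the handler. The /GRID1 and /GRID2 XDim paths and sizes come from the example above, and the YDim entries are added purely for illustration.

```python
from collections import defaultdict
import posixpath


def conflicting_short_names(dimension_sizes):
    """Return short names that map to more than one distinct size.

    `dimension_sizes` maps a full HDF5 path to the size of that dimension.
    """
    sizes_by_short_name = defaultdict(set)
    for path, size in dimension_sizes.items():
        # Dropping the group part mimics what short-path naming does.
        sizes_by_short_name[posixpath.basename(path)].add(size)
    return {name: sorted(sizes)
            for name, sizes in sizes_by_short_name.items()
            if len(sizes) > 1}


dimensions = {
    "/GRID1/XDim": 10,
    "/GRID2/XDim": 20,
    "/GRID1/YDim": 5,   # illustrative entry
    "/GRID2/YDim": 5,   # illustrative entry
}
conflicts = conflicting_short_names(dimensions)
```

In this example only XDim is reported, because both YDim dimensions have the same size and can safely share one variable.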
Providing "units" information requires a two-pass algorithm in the handler. The data provider must know which variables should have units attached. If a variable already has a "units" attribute, that could be a problem as well. Currently, units attributes are added manually.
Ferret is flexible in terms of variable name and length, but it requires generating Grid map data that fits into certain ranges (i.e. -180~180 / -90~90). This compromises the integrity of the data if a client is not informed. It has been acceptable for EOS so far because the map data is generated artificially by parsing the "StructMetadata.0" information.
Conclusion
Although many clients claim to support OPeNDAP, they do not support the full
capability that OPeNDAP can provide. We suspect that they rely mainly on the
NetCDF client library provided by OPeNDAP, which is aimed at supporting the
NetCDF format over the Internet. To fully support HDF5 capabilities such as
group structure and long names, it is important to develop an HDF5-friendly
OPeNDAP client library.