HDF Newsletter #6

Contents:
    HDF Email Group
    NSF Supports netCDF/HDF Project
    HDF 3.2 update
    No-transpose plans for HDF 3.2

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

HDF Email Group: hdfnews

In an effort to increase communication within the HDF community, we
have set up an HDF email group called "hdfnews@ncsa.uiuc.edu".

IF YOU ARE PART OF THE HDFNEWS GROUP, YOU SHOULD ALREADY HAVE
RECEIVED A COPY OF THIS NEWSLETTER.  IF NOT, AND YOU WISH TO CONTINUE
RECEIVING THE NEWSLETTERS, AS WELL AS ALL OTHER HDFNEWS
CORRESPONDENCE, YOU NEED TO JOIN THE GROUP.

If you join the hdfnews email group, you can both send and receive
hdfnews mail.  Any mail you send to hdfnews@ncsa.uiuc.edu will be
broadcast to all members of the group.

The mail group is managed by a mail list server whose address is
stglistserv@ncsa.uiuc.edu.  There are a number of things you can do
with the mail server.  For a full list of capabilities, send email to
stglistserv@ncsa.uiuc.edu, and include the word "help" (no quotes) as
the first and only line in your mail message.

TO JOIN THE HDFNEWS GROUP, send mail to stglistserv@ncsa.uiuc.edu
with the following message:

    subscribe hdfnews

To remove yourself from the hdfnews list, send mail to
stglistserv@ncsa.uiuc.edu with the following message:

    unsubscribe hdfnews

To get a listing of all members, send the message:

    recipients hdfnews

We welcome your participation, and your thoughts on this way of
distributing HDF information.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

NSF Supports netCDF/HDF Project

We are very excited to report that the National Science Foundation's
Division of Advanced Scientific Computing recently announced that it
will support a project to combine the features of Unidata's netCDF
programming interfaces and data models with NCSA's HDF file format.
NetCDF provides an elegant and powerful data model that can be used
with widely varying scientific data sets.
This NSF support represents an enormous opportunity to unify a large
segment of the scientific community.  The project will have immediate
benefit to scientists who use either netCDF or HDF, for it will
combine important features of each into one data storage format and
access library, making available to scientists a rich set of general
applications for analyzing, displaying, and sharing data.

The objectives of the project are

  (1) to adapt the HDF file structure to the netCDF data model,
      incorporating the netCDF interface in the HDF library,
  (2) to adapt NCSA visualization software to read netCDF objects
      stored in HDF files,
  (3) to adapt NCSA software to read both netCDF and HDF files, and
  (4) to investigate the possibility of incorporating NASA's CDF, a
      forerunner of netCDF, into the resultant package.

Chris Houck, a member of the NCSA HDF team, is the primary developer
in the project.  Chris has set up an HDF-netcdf news group, which you
are welcome to join and participate in.

To join the HDF-netcdf news group, send email to
HDF-netcdf-request@ncsa.uiuc.edu asking to be put on the list.
(Note: this is different from the way "hdfnews" works.  Sorry, we
switched technologies.)

Chris has been making real headway since he started working on the
project several months ago.  He should have a prototype ready soon.
Stay tuned...

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

HDF 3.2 Beta Update

We have just put a new version of HDF 3.2 beta on the anonymous ftp
server (ftp.ncsa.uiuc.edu; IP address 141.142.20.50) in the directory
HDF/HDF3.2beta.

Platforms supported
===================

The new version has been ported and fully tested on the Sun Sparc,
and almost fully tested on SGI (IRIX 4.0), Cray 2 (UNICOS), Cray Y-MP
(UNICOS), IBM RS/6000 (AIX), DecStation 5000, and Convex.  (The only
part that hasn't been tested on the non-Sparc machines is the
non-transposing SDS code.  See "No-transpose" below.)
The Mac and PC versions are close to being done, but have not been
tested.  An HP version has been created by Brian Wallace at Oak
Ridge, and we will be testing it as soon as we can.  We plan to work
on the VMS version in the next couple of weeks.

No-transpose
============

As mentioned in the last newsletter, the HDF 3.2 SDS Fortran
interface will no longer transpose arrays when writing them to a
file.  The implications of this change are described in full in the
next article ("No-transpose Plans") in this newsletter.  If you use
Fortran at any stage in working with HDF files, you probably should
read the article.

VSETS
=====

The most important change to the Vset interface is that the constants
for data types have changed.  Previously the type of data had been
specified with constants such as 'LOCAL_INTTYPE'.  This was a problem
in two ways:

  1) Not every machine's local integer is the same size, but Vsets
     treated them all as the same.
  2) It made the Vset interface incompatible with the new data
     conversion routines used by all of the other interfaces.

So, with this release, rather than specifying the type of data as
'LOCAL_XXXTYPE' you should now use the same type specifier as if you
were storing the data through the SDS interface: DFNT_INT8,
DFNT_INT16, DFNT_INT32, DFNT_FLOAT32...

Vsets also now allow the modification of existing VDatas.  By
attaching to an existing VData with write access you are now allowed
to seek (via VSseek()) to any location and place new data (via
VSwrite()).  If it is necessary to increase the size of an existing
VData, rather than deleting it and rewriting it, you should call:

    VSappendable(VDATA *vs)

You will then be able to seek or write past the previous end of the
VData.  You will also be permitted to read past the end of the valid
data in the VData.  It is currently the application programmer's
responsibility to detect and handle this case.
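
Since the library will not stop a read that runs past the end of the
valid data, one way an application might guard against it is to keep
its own count of valid records and clamp each read request before
issuing it.  The following is only a sketch of that idea.  The helper
clamp_read_count() is hypothetical and is not part of HDF; how the
application learns the number of valid records is also left as an
assumption.

```c
#include <assert.h>

/*
 * Hypothetical helper (NOT an HDF routine).
 * Given the number of valid records the application knows about, the
 * record position it is about to seek to, and the number of records
 * it wants, return how many records can safely be read.
 */
long clamp_read_count(long valid_records, long start, long requested)
{
    assert(valid_records >= 0 && start >= 0 && requested >= 0);

    /* Starting at or beyond the end: nothing valid to read. */
    if (start >= valid_records)
        return 0;

    /* Otherwise read no more than what remains. */
    long available = valid_records - start;
    return requested < available ? requested : available;
}
```

For a VData known to hold 100 valid records, a request for 20 records
starting at record 90 would be cut down to 10, and a request starting
at record 100 would be cut down to 0.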

Efficiency
==========

Slight improvements have been made to the "H-Layer", the lowest level
of the HDF library; these improvements should benefit the users of
every interface.  In addition, the efficiency of the code in the VSet
interface has been improved; these changes will, obviously, only
benefit people using the VSet interface.

Still to do
===========

The three remaining tasks we have are:

  * to convert the command-line utilities to work in HDF 3.2,
  * to redo the documentation, and
  * to fully test the new (non-transposing) SDS code on all
    machines.  (It has been tested on Sun-Sparc only.)

We plan to do an official release by the end of this month, even if
we do not finish the documentation or all of the utilities
conversions.

Thanks to all you adventurous souls who have given us feedback on
earlier beta versions.  It has been enormously helpful.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

No-transpose Plans

In newsletter #5 we proposed not to transpose the data array when HDF
functions are called from Fortran programs to read data from or write
data to an HDF file.  All the responses from users voted for no
transpose, with some concerns about how to convert old HDF files
written by Fortran programs and how the no-transpose interface will
look.

We also presented a module dfsd_notranspose.c which replaces dfsd.c
to build a transposeless HDF library.  With dfsd_notranspose, Fortran
calling programs are required to set dimension sizes in the reverse
of the order used in HDF3.1.  However, since the reversal of the
order of dimension sizes would require modification of old programs
in order to link them with the new library, we decided to change the
dimension order back to that of HDF 3.1 and let HDF 3.2 do the
reversal for Fortran programs.  In this way, old programs need not be
changed, and the dimension order is more natural to Fortran
programmers.

The following is what we plan to do:

1.
   HDF3.2 will no longer transpose the data arrays when reading or
   writing scientific data sets from Fortran programs.

2. Fortran programs call the no-transpose interface in the same way
   they call the 3.1 interface.  For example, suppose there is a data
   array declared as data(3,4,5) in a Fortran program.  Before
   calling DFSDputdata we should set up dimension information as:

       rank = 3
       dims(1) = 3
       dims(2) = 4
       dims(3) = 5

   and call dfsdputdata(filename, rank, dims, data).

   This means that dims(1) is the first dimension in the declaration
   of the array, which changes fastest in terms of the internal
   storage of an array in Fortran.  dims(3) is the last dimension and
   the slowest-changing dimension.

   Similarly, dfsdgetdata(filename, rank, bufsizes, indata) returns
   data starting at indata, and the values in bufsizes should be the
   same as those in dims: bufsizes(1)=3, bufsizes(2)=4, and
   bufsizes(3)=5.

3. There will be no change in the C program interface either.  If the
   data array is declared as data[100][10][12], DFSDputdata should be
   called with:

       rank = 3;
       dims[0] = 100;
       dims[1] = 10;
       dims[2] = 12;
       DFSDputdata(filename, rank, dims, data);

   Here, dims[0] is also the first dimension in the declaration of
   the array.  However, in C the first dimension is the
   slowest-changing dimension in terms of internal storage of the
   array.

4. The data elements in the data array will be written to the HDF
   file in the same sequence as they are stored in memory, no matter
   what the programming language is.  The information about how these
   elements are grouped together is contained in the values of the
   dimension sizes.  C and Fortran represent this information
   differently: the first dimension in C changes slowest while in
   Fortran it changes fastest.  To standardize, HDF can take only one
   of them, C order or Fortran order.  We have adopted C order.
   This means, in the HDF file, the first value of the dimension
   sizes is the size of the slowest-changing dimension and the last
   value is the size of the fastest-changing dimension.  In other
   words, HDF functions dfsdputdata() and dfsdgetdata() will reverse
   the order of dimension sizes when they are called from Fortran
   programs.  This reversal is done by HDF.  Application programmers
   just use the interface as shown above.  No extra transposition is
   needed in application programs.

5. Are HDF files written by the old (HDF3.1) library readable by 3.2?

   Yes.  When DFSDgetdata or DFSDgetslice is called from Fortran
   programs, they will read the data according to the version of the
   library that wrote the dataset.  If the dataset was written by 3.1
   (or earlier), it will be read back the way 3.1 does it: the data
   will be transposed and the dimension sizes will not be reversed.
   If the dataset was written by 3.2, it will be read back the 3.2
   way: the data will not be transposed while the dimension sizes
   will be reversed as appropriate.

   When DFSDgetdata() or DFSDgetslice() is called from C programs,
   they do the same as 3.1 in C.

6. Are HDF files written by the new (3.2) library readable by 3.1?

   If the dataset is of 32-bit floating-point type, it can be read by
   3.1.  However, the dimension sizes in the file are ordered from
   the slowest-changing dimension to the fastest-changing dimension,
   which is different from Fortran order.  The old library will
   transpose the data and not reverse the dimension sizes when it is
   called from Fortran programs.  In many cases this is not a big
   problem.  However, it must be taken care of by the application
   programmers when it does become a problem.

7. Scales, labels, units and formats for each dimension are written
   to the 3.2 HDF files in C order, from the slowest to the fastest.
   When dfsdgetdimstrs(1, label, unit, format) is called from a
   Fortran program, it will distinguish whether the SDS was written
   by 3.1 or 3.2.  If the SDS was written by 3.1, no reversal will
   take place.
   The first values of label, unit and format (of the SDS) in the
   file will be returned; with 3.2 files, dfsdgetdimstrs() will
   return the last values of label, unit and format (of the SDS) in
   the file.

   For the reason given in item 6, Fortran application programs need
   to be modified when they use the HDF3.1 library to read 3.2 HDF
   files.  Dimension-related information is stored in C order in 3.2
   HDF files.

Summary:

1. In HDF files written by Fortran programs using the HDF3.2 library,
   the data array is stored in the same order as it is in memory,
   while the order of dimension-related information such as dimension
   sizes, scales, labels, units and formats is reversed relative to
   the dimension order in the declaration of the array.

2. The format of the HDF3.2 Fortran interface for reading from and
   writing to an HDF file is the same as that of the HDF3.1
   interface.

3. 3.2 reads SDS from HDF 3.1 files in the way 3.1 does, so that old
   files can be read by the new library without any extra
   programming.

4. When the HDF3.2 library is called by a Fortran program to read an
   HDF3.2 SDS, it returns dimension-related information in the
   reverse of the order in the file.

5. When the HDF3.1 library is called by a Fortran program to read an
   HDF3.2 SDS, it returns dimension-related information in the same
   order as that in the file.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
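
To illustrate why reversing the dimension sizes is enough and no
transposition of the data is needed (see "No-transpose Plans"), the
sketch below computes where an element sits in memory under the
Fortran convention and under the C convention with reversed
dimensions.  The function names reverse_dims(), fortran_offset() and
c_offset() are made up for this illustration and are not HDF
routines; this is not the code used inside the library.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative only: copy dimension sizes in reverse order
 * (Fortran order -> C order, or back again). */
void reverse_dims(const int *in, int *out, int rank)
{
    assert(rank > 0);
    for (int i = 0; i < rank; i++)
        out[i] = in[rank - 1 - i];
}

/* Offset (in elements) of Fortran element data(idx[0],...,idx[rank-1]).
 * Indices are 1-based and the FIRST dimension changes fastest. */
size_t fortran_offset(const int *dims, const int *idx, int rank)
{
    size_t off = 0, stride = 1;
    for (int d = 0; d < rank; d++) {
        off += (size_t)(idx[d] - 1) * stride;
        stride *= (size_t)dims[d];
    }
    return off;
}

/* Offset (in elements) of C element data[idx[0]]...[idx[rank-1]].
 * Indices are 0-based and the LAST dimension changes fastest. */
size_t c_offset(const int *dims, const int *idx, int rank)
{
    size_t off = 0;
    for (int d = 0; d < rank; d++)
        off = off * (size_t)dims[d] + (size_t)idx[d];
    return off;
}
```

For the data(3,4,5) example, reversing the Fortran dimension sizes
{3,4,5} gives the C order {5,4,3}.  The Fortran element data(2,3,4)
and the C element [3][2][1] of a [5][4][3] array both fall at element
offset 43, so the same bytes in the file serve both views once the
dimension sizes are reversed.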