Tag Archives: HDF5

Converting HDF5 format AirMOSS data to a GDAL image

As NASA’s preferred file format a lot of remote sensing data is now delivered in HDF5 format. However, as there are multiple ways of storing geospatial information within HDF5 reading the data with GDAL often requires some effort. Luckily with the h5py and GDAL Python libraries converting to another GDAL format is easy.

The instructions here are for reading Synthetic Aperture Radar (SAR) data from the AirMOSS mission, a polarimetric P-band system. More details about the AirMOSS mission and links to download data are available from airmoss.jpl.nasa.gov (registration required to download data). However, the general process should be similar for most HDF5 formats.

  1. Import file
  2. import h5py
    inHDF5File = "indata.h5"
    # Read in data
    indata = h5py.File(inHDF5File)
  3. Get some information about data
  4. # List keys
    for key in indata.keys():
    # List items
    for item in indata.items():
    # List attributes
    for att in indata.attrs:
  5. Extract data
  6. hhData = indata['HHHH']
    vvData = indata['VVVV']
    hvData = indata['HVHV']
  7. Get geospatial information
  8. For AirMOSS data the geospatial information are stored as attributes in the HDF5 file and are read using ‘indata.attrs’

    # Set image size
    inXSize = hhData.shape[1]
    inYSize = hhData.shape[0]
    # Get coordinates
    minLon = indata.attrs['northwest longitude'] 
    maxLat = indata.attrs['northwest latitude']
    # Get pixel resolution
    pixelX = indata.attrs['longitude spacing']
    pixelY = indata.attrs['latitude spacing']
  9. Create new GDAL file
  10. You can select any format GDAL can write to as the output format, I’ve chosen to stick with HDF5 and go with the KEA format (which provides an implementation of the full GDAL specification).

    # Set Geospatial Information
    geoTransform = []
    for i in range(6):
    geoTransform[0] = minLon # top left x 
    geoTransform[1] = pixelX
    geoTransform[2] = 0
    geoTransform[3] = maxLat # top left y 
    geoTransform[4] = 0
    geoTransform[5] = pixelY
    srs = osr.SpatialReference()
    # Creat output image
    numBands = 3
    gdalDriver = getGDALFormatFromExt(outFileName)
    driver = gdal.GetDriverByName(gdalDriver)
    metadata = driver.GetMetadata()
    newDataset = driver.Create(outFileName, inXSize, inYSize, numBands, gdal.GDT_Float32)
  11. Copy data to new file
  12. newDataset.GetRasterBand(1).WriteArray(hhData)
    # Close dataset
    newDataset = None

    The data should now be readable in anything which can read the output format you selected (e.g., TuiView.)

    The full script can be downloaded from https://github.com/MiXIL/AirMOSS-GDAL-Utilities/, there are some differences in the final script, a loop is used to read and write each band, rather than loading all the data to memory and the no data value of -9999 is changed to 0.