# Converting HDF5 format AirMOSS data to a GDAL image

As NASA’s preferred file format a lot of remote sensing data is now delivered in HDF5 format. However, as there are multiple ways of storing geospatial information within HDF5 reading the data with GDAL often requires some effort. Luckily with the h5py and GDAL Python libraries converting to another GDAL format is easy.

The instructions here are for reading Synthetic Aperture Radar (SAR) data from the AirMOSS mission, a polarimetric P-band system. More details about the AirMOSS mission and links to download data are available from airmoss.jpl.nasa.gov (registration required to download data). However, the general process should be similar for most HDF5 formats.

1. Import file
2. import h5py

inHDF5File = "indata.h5"

indata = h5py.File(inHDF5File)


3. Get some information about data
4. # List keys
for key in indata.keys():
print(key)

# List items
for item in indata.items():
print(item)

# List attributes
for att in indata.attrs:
print(att)


5. Extract data
6. hhData = indata['HHHH']
vvData = indata['VVVV']
hvData = indata['HVHV']

7. Get geospatial information
8. For AirMOSS data the geospatial information are stored as attributes in the HDF5 file and are read using ‘indata.attrs’

# Set image size
inXSize = hhData.shape[1]
inYSize = hhData.shape[0]

# Get coordinates
minLon = indata.attrs['northwest longitude']
maxLat = indata.attrs['northwest latitude']

# Get pixel resolution
pixelX = indata.attrs['longitude spacing']
pixelY = indata.attrs['latitude spacing']

9. Create new GDAL file
10. You can select any format GDAL can write to as the output format, I’ve chosen to stick with HDF5 and go with the KEA format (which provides an implementation of the full GDAL specification).

# Set Geospatial Information
geoTransform = []
for i in range(6):
geoTransform.append(0.0)
geoTransform[0] = minLon # top left x
geoTransform[1] = pixelX
geoTransform[2] = 0
geoTransform[3] = maxLat # top left y
geoTransform[4] = 0
geoTransform[5] = pixelY

srs = osr.SpatialReference()
srs.SetWellKnownGeogCS("WGS84")

# Creat output image
numBands = 3
gdalDriver = getGDALFormatFromExt(outFileName)
driver = gdal.GetDriverByName(gdalDriver)
newDataset = driver.Create(outFileName, inXSize, inYSize, numBands, gdal.GDT_Float32)
newDataset.SetGeoTransform(geoTransform)
newDataset.SetProjection(srs.ExportToWkt())

11. Copy data to new file
12. newDataset.GetRasterBand(1).WriteArray(hhData)
newDataset.GetRasterBand(1).SetDescription('HH')

# Close dataset
newDataset = None


The data should now be readable in anything which can read the output format you selected (e.g., TuiView.)

The full script can be downloaded from https://github.com/MiXIL/AirMOSS-GDAL-Utilities/, there are some differences in the final script, a loop is used to read and write each band, rather than loading all the data to memory and the no data value of -9999 is changed to 0.

# Obtaining JAXA PALSAR Data and Forest / Non-Forest Maps

JAXA have recently released their global forest / non-forest map at 50 m resolution and the Advanced Land Orbiting Satellite (ALOS) Phased Array L-Band SAR (PALSAR) data from which they were derived. This is really exciting because SAR data provides a different view of the world than optical data, which we’re more used to viewing. A particularly interesting feature of L-band SAR for mapping vegetation is the ability to ‘see’ through clouds and the canopy layer of vegetation. A good introduction to SAR data, in the context of vegetation mapping, is provided in the following paper:

Rosenqvist, A., Finlayson, C. M., Lowry, J and Taylor, D., 2007. The potential of long- wavelength satellite-borne radar to support implementation of the Ramsar Wetlands Convention. Aquatic Conservation: Marine and Freshwater Ecosystems. 17:229–244.

http://www.eorc.jaxa.jp/ALOS/en/palsar_fnf/fnf_index.htm

You need to sign up for an account but this is a quick and straightforward process.

## Mosaic data

To mosaic all files in a 5 x 5 degree batch (or any number of files), you can use a combination of GNU Parallel to untar and gdalbuildvrt. Assuming we want to mosaic the HH- and HV-polarisation PALSAR data the following commands can be used:

# Untar file
tar -xf N60W105_07_MOS.tar.gz

# Change into directory
cd N60W105

# Untar all files, in parallel using GNU Parallel
ls *.gz | parallel tar xf

# Create a list of HH and HV files
ls *_HH > hhfilelist.txt
ls *_HV > hvfilelist.txt

# Build VRT
gdalbuildvrt -input_file_list hhfilelist.txt N60W105_HH.vrt
gdalbuildvrt -input_file_list hvfilelist.txt N60W105_HV.vrt



This will create virtual rasters, which are text files containing references to the data. You can convert to real rasters (KEA, geotiff etc.,) using gdal_translate:

gdal_translate -of GTiff 60W105_HH.vrt 60W105_HH.tif
gdal_translate -of GTiff 60W105_HV.vrt 60W105_HV.tif


## Calibrate data

The data are supplied as digital numbers, to calibrate and convert to dB, the following equation is used [1]:

$10 \times \log10 (DN^2) - 83.0$

You can run this calibration in RSGISLib using the following Python script:

 # Import RSGISLib
import rsgislib
from rsgislib import out image

# Set input and output image
inimage = 'N60W105_HH.vrt'
outimage = 'N60W105_HH_db.kea'

# Run image maths
imagecalc.imageMath(inimage, outimage, '10*log10(b1^2) - 83.0,
'KEA', rsgislib.TYPE_32FLOAT)

# Calculate stats and pyramids (for fast display)
imageutils.popImageStats(outimage,True,0.,True)



Alternatively you could grab the calPALSAR_dB.py script to perform the same calculation using RIOS (which will run under windows, see earlier post for more details)

SAR data takes a while to get your head into but once you do it provides a wealth of information.

Update – data are now available at 25 m resolution from the same place

[1] Shimada, M. and Ohtaki, T. 2010. Generating large-scale high-quality SAR mosaic datasets: Application to PALSAR data for global monitoring. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. 3(4):637–656.