Category Archives: General Remote Sensing

Convert Sentinel-2 Data Using GDAL

The latest version of GDAL (2.1) has a driver to read Sentinel 2 data (see http://gdal.org/frmt_sentinel2.html). Like HDF files they are read as subdatasets. Running ‘gdalinfo’ on the zipped folder or the .xml file contained within the .SAFE directory will display all the subdatasets, as well as all the metadata (so quite a lot of information!).

As with HDF5 you can pass the subdataset names to gdalinfo to get more information or gdal_translate to extract them as a separate dataset.

To make it easier to extract all the subdatasets I wrote a script (extract_s2_data.py) which can be downloaded from the https://bitbucket.org/petebunting/rsgis_scripts repository.

The script will get a list of all subdatasets using GDAL:

from osgeo import gdal
dataset = gdal.Open('S2/S2.xml', gdal.GA_ReadOnly)
subdatasets = dataset.GetSubDatasets()
dataset = None

The ones to be extracted are for the 10, 20 and 60 m resolution band groups for each UTM zone (if the file crosses multiple zones).

For each subdataset it will give an output name, replacing the EPSG code with the UTM zone and ‘:’ with ‘_’.

Then the gdal_translate command is used to create a new file for each. By default the output is KEA format, called using subprocess.

To run the script first install GDAL 2.1, the conda-forge channel has recent builds, to install them using conda:

conda create -n gdal2 -c conda-forge gdal
source activate gdal2

(If you are on Windows leave out ‘source’)

To extract all subdatasets from a zipped Sentinel 2 scene to the current directory you can then use:

extract_s2_data.py -o . \
S2A_OPER_PRD_MSIL1C_PDMC_20151201T144038_R010_V20151130T142545_20151130T142545.zip

The gdal_translate command used is printed to the screen.

The default output format is KEA, you can change using the ‘–of’ flag. For example to convert an unzipped scene to GeoTiff:

extract_s2_data.py -o . --of GTiff \
S2A_OPER_PRD_MSIL1C_PDMC_20151201T144038_R010_V20151130T142545_20151130T142545.SAFE

To get the extension for all supported drivers, and some creation options the ‘get_gdal_drivers’ module from arsf_dem_scripts is optionally used. You can just download this file and copy into the same directory ‘extract_s2_data.py’ has been saved to. For Linux or OS X you can run:

# OS X
curl https://raw.githubusercontent.com/pmlrsg/arsf_dem_scripts/master/arsf_dem/get_gdal_drivers.py > get_gdal_drivers.py

# Linux
wget https://raw.githubusercontent.com/pmlrsg/arsf_dem_scripts/master/arsf_dem/get_gdal_drivers.py

Useful GDAL commands for working with hyperspectral data

The Geospatial Data Abstraction Library (GDAL) contains a number of utilities for working with remote sensing data. The following are some commands which are particularly useful when working with airborne hyperspectral data.

Create subset and change format

To create a band and spatial subset of a larger hyperspectral dataset the following command can be used:

gdal_translate -of GTiff -co COMPRESS=LZW \
 -projwin 481500 766000 482500 765000 \
 -b 190 -b 40 -b 15  \
 outputs/f281013b_utm_wgs84N50_mapped_subsection.bil \
 outputs/f281013b_utm_wgs84N50_mapped_subsection_3band.tif
  • -of: Output Format – we are using GeoTiff file. For all available options see http://www.gdal.org/formats_list.html
  • -co: Creation Option – specifies the creation options, these are specific to the format. For GeoTiff we use compression to reduce the file size. For ENVI we can use the creation options to specify the Interleave (BSQ or BIL)
  • -projwin: A rectangle delineating an area to be extracted, in the format min X min Y max X max Y
  • -b: A given band to extract
  • final two arguments: input and output filenames

Create mosaic of multiple flightlines

Often a mosaic of all individual flightlines is required. Due to the large sizes of the mapped files this can often be time consuming. One option is to first make a ‘Virtial Raster’ using the ‘gdalbuildvrt’ command:

gdalbuildvrt -srcnodata 0 outputs/f2810_mapped_mosaic.vrt \
 outputs/f281013b_utm_wgs84N50_mapped_3band.bil \
 outputs/f281023b_utm_wgs84N50_mapped_3band.bil
  • -srcnodata: No data value of input lines
  • first argument: output VRT file
  • final arguments: list of input files

Rather than creating a new file a small .vrt text file is created which contains links to the individual lines.

Many programs which use GDAL (e.g., TuiView) can open this file directly. To make opening the file easier we can do some optimisation. First pre-calculate statistics using:

gdalinfo -stats outputs/f2810_mapped_mosaic.vrt
  • -stats: Calculate statistics and save to .aux.xml file
  • final argument: file name

The gdalinfo command can also be used to print information about a raster. Note for the .vrt file it shows the individual files which are included.

After creating statistics generate overviews (lower resolution copies at different resolutions) using gdaladdo:

gdaladdo -r average outputs/f2810_mapped_mosaic.vrt 2 4 8 16
  • -r average: Use average for resampling overviews
  • first argument: file name
  • numbers: overview levels to build (2x, 4x, 8x and 16x)

We can also make a real raster using gdal_translate (as shown previously).

Extract pixel values for a given location

gdallocationinfo -valonly -geoloc \
 outputs/f281013b_utm_wgs84N50_mapped_subsection.bil \
 481801 765289
  • -valonly: Only print values to the screen
  • -geoloc: Interprerate x and y values as coordinates rather than pixel and line
  • first argument: file name
  • x y: Easting and Northing of pixel to extract statistics from

This will print the values to the screen. To save as a text file we can redirect the output using ‘>’

gdallocationinfo -valonly -geoloc \
 outputs/f281013b_utm_wgs84N50_mapped_subsection.bil \
 481801 765289 > outputs/extracted_values.txt

This text file can be opened in other programs for further analysis.

This post is based on the hyperspectral practical session from the NERC Airborne Survey and Research Facility workshop held at Plymouth Marine Laboratory, March 2016. For more information about the ARSF Data Analysis Node (ARSF-DAN) and details about future training opportunities follow @ARSF_DAN on twitter. Hyperspectral data acquired by ARSF are available to download from NEODC.

Copy information between ENVI header files

The header files (.hdr) of ENVI files are text files which contain the size of the file (pixels and lines) geospatial information and other meta data such as wavelengths, data units etc.,

Not all programs are aware of all items in then header file and some may become lost during processing. ARSF-DAN have made a script available which can copy attributes from one header to another. The script can be downloaded from https://github.com/pmlrsg/arsf_tools and is run using:

copy_header_info.py f221013_mapped_osng.bil.hdr \
                      f221013_nadir_mapped_osng.hdr \
                      -k "map info" "projection info" "wavelength"

To show the available fields the following script (also from the arsf_tools repo) can be used:

get_info_from_header.py f221013_mapped_osng.bil.hdr

Create a GIF from a series of images

Creating a video is a cool way to visualise a time series of remote sensing data / outputs. Rather than creating a full video you can also use a GIF. One advantage of GIFs is you can embed them in Powerpoint presentations and they should will play in presentation mode (I’ve found them more reliable than videos in other formats).

You can use the ImageMagick convert command to create a GIF:

convert -delay 20 -loop 0 in_files/*.png out_animation.gif

However, depending on how your files are named it might not sort the numbers as expected leading to jumps in the video. This problem was addressed in the following post on the ‘remotelysensible’ blog:

https://remotelysensible.wordpress.com/2014/09/18/alphanumeric-string-sorting-in-python/

This can be used to sort the files in Python then call the convert command using subprocess:

#!/usr/bin/env python

import argparse
import re
import subprocess

re_natural = re.compile('[0-9]+|[^0-9]+')

def natural_key(s):
   """
   Natural sorting function from http://stackoverflow.com/questions/12184015/in-python-how-can-i-naturally-sort-a-list-of-alphanumeric-strings-such-that-alp
   """
   return [(1, int(c)) if c.isdigit() else (0, c.lower()) for c in re_natural.findall(s)] + [s]

if __name__ == '__main__':
   parser = argparse.ArgumentParser(description='Create an animated gif from a series of images')
   parser.add_argument("images", nargs='+',type=str, help="Input images")
   parser.add_argument('-o', '--outgif',
                       metavar ='Output gif',
                       help ='Output name for gif',
                       required=True)
   args=parser.parse_args()

   sorted_png_files = sorted(args.images, key=natural_key)

   convert_cmd = ['convert','-delay','20','-loop','0']
   convert_cmd.extend(sorted_png_files)
   convert_cmd.extend([args.outgif])

   subprocess.call(convert_cmd)

The script is run using:

python create_gif.py -o out_animation.gif in_files/*.png 

As well as creating GIFS of a time series you can use SPD 3D Points Viewer to export images of a point cloud as it rotates and create a gif from these similar to this video of airborne LiDAR data over Lake Vyrnwy on YouTube: https://www.youtube.com/watch?v=P_9998KJib4

The script can be downloaded from GitHub. Note the delay is hardcoded at 20/100th of a second but this can be changed if needed.

Batch processing atmospheric correction using ARCSI

Previous posts have covered atmospheric correction using the Atmospheric and Radiometric Correction of Satellite Imagery (ARCSI) software. When a large number of scenes require processing then commands to automate steps are desirable and greatly simplify the process.

ARCSI provides a number of commands which are useful for batch processing:

  • arcsisortlandsat.py sorts the landsat ‘tar.gz’ files into individual directories for each sensor (i.e., LS 5 TM) and also builds a standard directory structure for processing.
  • arcsiextractdata.py extracts the contents of \verb|tar| and \verb|tar.gz| archives with each archive being extracted into an individual directory.
  • arcsibuildcmdslist.py builds the ‘arcsi.py’ commands for each scene creating a shell script.

The files used for this tutorial are shown below, but others could be downloaded from from earthexplorer and used instead.

LC80720882013174LGN00.tar.gz	LT52040232011310KIS00.tar.gz
LC82040242013139LGN01.tar.gz	LT52040242011118KIS00.tar.gz
LT52040232011118KIS00.tar.gz	LT52050232011109KIS00.tar.gz

Sort Landsat Scenes

The first command to execute is ‘arcsisortlandsat.py’ which needs to be executed from within the directory containing files:

arcsisortlandsat.py -i . -o .

This will generate the following directory structure and move the files into the appropriate directories.

> ls *
LS5:
Inputs	Outputs	RAW	tmp

LS8:
Inputs	Outputs	RAW	tmp

> ls */RAW
LS5/RAW:
LT52040232011118KIS00.tar.gz	LT52040242011118KIS00.tar.gz
LT52040232011310KIS00.tar.gz	LT52050232011109KIS00.tar.gz

LS8/RAW:
LC82040242013139LGN01.tar.gz	LC82060242015047LGN00.tar.gz
LC82060232015047LGN00.tar.gz

Extract Data

To extract the data from the ‘tar.gz’ files the ‘arcsiextractdata.py’ command is used as shown below:

arcsiextractdata.py -i ./LS5/RAW/ -o ./LS5/Inputs/

arcsiextractdata.py -i ./LS8/RAW/ -o ./LS8/Inputs/

Once the files are extracted the directory structure will look like the following:

> ls */RAW
LS5/RAW:
LT52040232011118KIS00.tar.gz	LT52040242011118KIS00.tar.gz
LT52040232011310KIS00.tar.gz	LT52050232011109KIS00.tar.gz

LS8/RAW:
LC80720882013174LGN00.tar.gz	LC82040242013139LGN01.tar.gz

> ls */Inputs
LS5/Inputs:
LT52040232011118KIS00	LT52040242011118KIS00
LT52040232011310KIS00	LT52050232011109KIS00

LS8/Inputs:
LC82040242013139LGN01	LC82060242015047LGN00
LC82060232015047LGN00

Build ARCSI Commands

To build the ‘arcsi.py’ commands, one for each input file, the following commands are used. Notice that these are very similar to the individual commands that you previously executed but now provide inputs to the ‘arcsibuildcmdslist.py’ command which selects a number of input files and generate a single shell script output.

arcsibuildcmdslist.py -s ls5tm -f KEA --stats -p RAD DOSAOTSGL SREF \
          --outpath ./LS5/Outputs --aeroimg ../WorldAerosolParams.kea \
          --atmosimg ../WorldAtmosphereParams.kea --dem ../UKSRTM_90m.kea \
          --tmpath ./LS5/tmp --minaot 0.05 --maxaot 0.6 --simpledos \
          -i ./LS5/Inputs -e MTL.txt -o LS5ARCSICmds.sh

arcsibuildcmdslist.py -s ls8 -f KEA --stats -p RAD DOSAOTSGL SREF \
        --outpath ./LS8/Outputs --aeroimg ../WorldAerosolParams.kea \
        --atmosimg ../WorldAtmosphereParams.kea --dem ../UKSRTM_90m.kea \
        --tmpath ./LS8/tmp --minaot 0.05 --maxaot 0.6 --simpledos \
        -i ./LS8/Inputs -e MTL.txt -o LS8ARCSICmds.sh

Following the execution of these commands the following two files will have been created ‘LS5ARCSICmds.sh’ and ‘LS8ARCSICmds.sh’. These files contain the ‘arcsi.py’ commands to be executed. Open the files and take a look, you will notice that all the file paths have been convert to absolute paths which means the file can be executed from anywhere on the system as long as the input files are not moved.

Execute ARCSI Commands

To execute the ‘arcsi.py’ commands the easiest methods is to run each in turn using the following command:

sh LS5ARCSICmds.sh

sh LS8ARCSICmds.sh

This will run each of the commands sequentially. However, most computers now have multiple processing cores and to take advantage of those cores we can use the GNU parallel command line tool. Taking advantage of those cores means that processing can be completed much quicker and more efficiently.

cat LS5ARCSICmds.sh LS8ARCSICmds.sh | parallel -j 2 

The switch ‘-j 2’ specifies the number of processing cores which can be used for this processing. If no switches are provided then all the cores will be used. As some parts of the process involve reading / writing large files, depending on the speed of you hard drive and the number of cores you have available you might find limiting the number of cores is needed. Please note that until all processing has completed nothing will be printed to the console. You can check the command are running using the top / htop command.

Once you have completed your processing you should clean up your system to remove any files you don’t need for any later processing steps. In most cases you will only need to keep the ‘tar.gz’ (so you can reprocess the RAW data at a later date if required) and the SREF product. It is recommended that you also retain the scripts you used for processing and a record of the commands you used for a) reference if you need to rerun the data and b) as documentation for the datasets so you know what parameters and options were used.

This post was adapted from notes for a module taught as part of the Masters course in GIS and Remote Sensing at Aberystwyth University, you can read more about the program here.

Apply the same stretch to multiple files in TuiView

There are some cases when it is necessary to play around with the stretch used to display data in TuiView (edit -> stretch). For example, if the scene contains very bright or dark areas (e.g., clouds/shadows) this can skew the statistics and make the images display too light/dark. One option is to use a local stretch (round house icon), which will only use values from the current display to calculate statistics. You can also manually set the minimum and maximum values.

Once a good stretch has been obtained this can be saved to the image for formats which support it (e.g., KEA) using the ‘Save Stretch and Lookup Table’ button (disk icon). If multiple files need to be opened with the same stretch applied you can pass in a file with a saved stretch when opening TuiView e.g.,

tuiview --stretchfromgdal image_with_saved_stretch.kea *kea

This will open all files matching the pattern ‘*bil’, applying the stretch saved in ‘image_with_saved_stretch.kea’

You can also save a stretch to a text file using the ‘Export Stretch and Lookup Table to Text File’ button (disk with ABC icon), this can be passed in when opening TuiView using

tuiview --stretchfromtext casi.stretch *bil

This will open all files matching the pattern ‘*bil’, applying the stretch saved in the text file ‘casi.stretch’

Inverting for Aerosol Optial Thickness (AOT) using ARCSI

Aerosol Optial Thickness (AOT) as an important parameter in atmospheric correction models such as 6S, which is used by the Atmospheric and Radiometric Correction of Satellite Imagery (ARCSI) software (see previous post for more details). With time and expert knowledge it would be possible to derive an AOT value for each scene manually. In some regions of the world it has been found that a constant can be used (e.g., Australia use an AOT of 0.05; [1]).

However, for most regions a constant is not viable as the atmosphere is too variable (both temporally and spatially) and not correctly parameterising for AOT can lead to large errors in atmospheric correction [2].

ARCSI provides a method of deriving AOT from an image by using a dark object subtraction (DOS) to estimate surface reflectance in the blue channel. Taking this estimated reflectance as input, 6S is then numerically inverted to identify an AOT value which provides a surface reflectance value as close to that estimated from DOS as possible.

To execute this functionality the following command is used:

arcsi.py --sensor ls8 \
    -i Inputs/LC82040242013139LGN01/LC82040242013139LGN01_MTL.txt \
    -o ./OutputsAOTInv \
    --tmpath ./tmp \
    -f KEA --stats \
    --prodlist RAD DOSAOTSGL SREF \
    --aeroimg ../WorldAerosolParams.kea \
    --atmosimg ../WorldAtmosphereParams.kea \
    --dem ../UKSRTM_90m.kea \
    --minaot 0.05 \
    --maxaot 0.6 --simpledos 

Where:

--sensor ls8
specifies the sensor (in this example Landsat 8)

-i Inputs/LC82040242013139LGN01/LC82040242013139LGN01_MTL.txt
is the header file of the input image.

-o ./Outputs
specifies the output directory where all output file will be saved to.

--tmpath ./tmp
is a directory where temporary files can be outputted during the processing, these will be deleted afterwards.

--f KEA
specifies that the output image file format should be KEA. Any GDAL supported format can be used.

-–stats
specifies that the output images should be populate with statistics and pyramids – makes display much faster (only available for KEA and HFA output formats).

--prodlist RAD TOA SREF
specifies the output products to be generated. You only need to specify what you want, e.g., SREF but if other products (e.g., RAD) are required then these will also be produced even if they are not specified.

--aeroimg ../WorldAerosolParams.kea
is an image file which is sampled to identify the aerosol model to be used for the input image (this can be downloaded from ARCSI downloads).

–atmosimg ../WorldAtmosphereParams.kea
is an image file which is sampled to identify the atmosphere model to be used for the input image (this can be downloaded from ARCSI downloads).

–dem ../UKSRTM_90m.kea
is an elevation model covering the scene, in this case the 90 m SRTM product. The higher the resolution of the DEM available the better. SRTM data are available to download from earthexplorer.

--minaot 0.05
is the lower limit of the AOT values tested when estimating the AOT value to be used to correct the scene.

--maxaot 0.6
is the upper limit of the AOT values tested when estimating the AOT value to be used to correct the scene.

--simpledos
specifies that a simple single valued DOS method should be used to provide an estimate of the Blue SREF used in the inversion.

This command will estimate a value for AOT, use this to parameterise and run 6S to build a LUT at elevation steps of 100 m and apply this to produce surface reflectance (multiplied by a scaling factor of 1000 to reduce the file size).

This post was adapted from notes for a module taught as part of the Masters course in GIS and Remote Sensing at Aberystwyth University, you can read more about the program here.

References
[1] S.S. Gillingham , N. Flood , T.K. Gill. 2013. On determining appropriate aerosol optical depth values for atmospheric correction of satellite imagery for biophysical parameter retrieval: requirements and limitations under Australian conditions; International Journal of Remote Sensing. Vol. 34(6). 2089-2100. http://dx.doi.org/10.1080/01431161.2012.738945
[2] Wilson, R.T., Milton, E.J. & Nield, J.M., 2014. Spatial variability of the atmosphere over southern England, and its effect on scene-based atmospheric corrections. International Journal of Remote Sensing. 35(13). 5198-5218. http://dx.doi.org/10.1080/01431161.2014.939781 (‘Behind the paper’ and pre-print link).