Create a CSV with the coordinates from geotagged photos

Recently I took a lot of photos on my phone during fieldwork (survey for the NERC-ARF LiDAR and hyperspectral calibration flights) and wanted to extract the GPS coordinates from each photo so I could load them into QGIS using the add delimited text dialogue. The aim wasn’t to have precise locations for each photo (GPS positions for each of the points surveyed were recorded separately) but to give a quick idea of the locations we’d visited before the main GPS data were processed.

I while ago (2012!) I wrote a script for this task. The script (CreateJPEGKMZ) requires pillow and the imagemagick command line tools.

It code isn’t particularly tidy (despite my recent updates) but it basically pulls out the GPS location from the EXIF tags and writes this to a CSV file. To create a KMZ a thumbnail image is created and a corresponding KML file. Both the thumbnail and KML are then zipped together to generate the KMZ which can be opened in GoogleEarth.

To use the script to write a CSV file: --outcsv gps_points.csv input_jpeg_files

To also create KMZ files for each photo: --outcsv gps_points.csv \
                 --outkmz output_kmz_files 

This was also a good lesson in the benefits of having scripts in version control and publicly available.


Scalable image segmentation using RSGISLib

To for application to very large remote sensing datasets, an approach to “Scalable image segmentation” presented in [1] using RSGISLib. In the paper a 30 m spatial resolution satellite mosaic of Australia was segmented by splitting into tiles, processing each tile on a separate node of a HPC, merging and then performing a second segmentation to remove artefacts at tile boundaries.
For a full description of the approach see section 5.3 of the open access paper (available to view online at

A version of this approach, designed to be used on a single machine, is now available in RSGISLib through the performTiledSegmentation function. The function is called in a similar way to the existing runShepherdSegmentation function.

For instructions on installing the latest version of RSGISLib see this post, skipping to section 7 if you already have Linux or OS X installed.

The code below presents an example of applying either the tiled or standard segmentation through the use of a command line flag:

#!/usr/bin/env python
Apply segmentation to image

from __future__ import print_function
import argparse
import shutil
import sys
import tempfile
from rsgislib.segmentation import tiledsegsingle
from rsgislib.segmentation import segutils

# Set values for clustering

# Set values for tiles segmentation

parser = argparse.ArgumentParser()
parser.add_argument("inputimage", nargs=1,
                    type=str, help="Input Image")
parser.add_argument("outputclumps", nargs=1,
                    type=str, help="Output clumps")
                    help="Use tiled segmentation")
args = parser.parse_args()

# Make temp directory for intermediate files
temp_dir = tempfile.mkdtemp(prefix='rsgislib_seg_')

if args.tiled:
    # If requested run tiled segmentation
                                             sampling=100, kmMaxIter=200)
    # If not run standard
                                      sampling=100, kmMaxIter=200)

Note, script edited for post – full version available from GitHub.

[1] Clewley, D.; Bunting, P.; Shepherd, J.; Gillingham, S.; Flood, N.; Dymond, J.; Lucas, R.; Armston, J.; Moghaddam, M. A Python-Based Open Source System for Geographic Object-Based Image Analysis (GEOBIA) Utilizing Raster Attribute Tables. Remote Sensing 2014, 6, 6111-6135.

Working with hyperspectral data in ENVI BIL format using Python

For working with ENVI files I normally use GDAL as code can then be applied to different formats. However, there are a couple of limitations with GDAL when working with hyperspectral data in ENVI format:

  1. GDAL doesn’t copy every item from the header file to a new header file if they don’t fit in with the GDAL data model. Examples are FWHM and comments. Sometimes extra attributes are copied to the aux.xml file GDAL creates, these files aren’t read by ENVI or other programs based on IDL (e.g., ATCOR).
  2. For data stored Band Interleaved by Line (BIL) rather than Band Sequential (BSQ) reading and writing a band at a time is inefficient as it is necessary to keep jumping around the file

To overcome these issues NERC-ARF-DAN use their own Python functions for reading / writing header files and loading BIL files a line at a time. These functions have been tidied up and released through the NERC-ARF Tools repository on GitHub ( The functions depend on NumPy.

To install them it is recommended to use the following steps:

  1. Download miniconda from and follow the instructions to install.
  2. Open a ‘Command Prompt’ / ‘Terminal’ window and install numpy by typing:
    conda install numpy
  3. Download ‘arsf_tools’ from GitHub (
  4. Unzip and within a ‘Command Prompt’ or ‘Terminal’ window navigate to the the folder using (for example):
    cd Downloads\arsf_tools-master
  5. Install the tools and library by typing:
    python install

Note, if you are using Linux you can install the arsf_binary reader from which is written in C++. The ‘arsf_envi_reader’ module will import this if available as it is faster than the standard NumPy BIL reader.

If you are a UK based researcher with access to the JASMIN system the library is already installed and can be loaded using:

module load contrib/arsf/arsf_tools

An simple example of reading each line of a file, adding 1 to every band and writing back out again is:

from arsf_envi_reader import numpy_bin_reader
from arsf_envi_reader import envi_header

# Open file for output
out_file = open("out_file.bil", "w")

# Open input file
in_data = numpy_bin_reader.BilReader("in_file.bil")

for line in in_data:
out_line = line + 1

# Copy header

# Close files
in_data = None

A more advanced example is applying a Savitzky-Golay filter to each pixel. As the filter requires every band for each pixel it is efficient to work with BIL files.

For developing algorithms using spatial data, in particular multiple files it is recommended to convert the files to another band-sequential format using the ‘gdal_translate’ so they can be read using programs which use GDAL, for example RIOS or RSGISLib.

To convert files from ENVI BIL to ENVI BSQ and copy all header attributes the ‘envi_header’ module can be used after converting the interleave with GDAL. For example:

import os
import subprocess
from arsf_envi_reader import envi_header

# Set input and output image (could get with argparse)
input_image = "input.bil"
output_image = "output_bsq.bsq"

# Get interleave from file extension
output_interleave = os.path.splitext(output_image)[-1].lstrip(".")

# 1. Convert interleave
print("Converting interleave to {}".format(output_interleave.upper()))
gdal_translate_cmd = ["gdal_translate",
                      "-of", "ENVI",
                      "-co", "INTERLEAVE={}".format(output_interleave.upper())]
gdal_translate_cmd.extend([input_image, output_image])

# 2. Copy header (GDAL doesn't copy all items)
# Find header files
input_header = envi_header.find_hdr_file(input_image)
output_header = envi_header.find_hdr_file(output_image)

# Read input header to dictionary
input_header_dict = envi_header.read_hdr_file(input_header)

# Change interleave
input_header_dict["interleave"] = output_interleave.upper()

# Write header (replace one generated by GDAL)
envi_header.write_envi_header(output_header, input_header_dict)

Useful GDAL commands for working with hyperspectral data

The Geospatial Data Abstraction Library (GDAL) contains a number of utilities for working with remote sensing data. The following are some commands which are particularly useful when working with airborne hyperspectral data.

Create subset and change format

To create a band and spatial subset of a larger hyperspectral dataset the following command can be used:

gdal_translate -of GTiff -co COMPRESS=LZW \
 -projwin 481500 766000 482500 765000 \
 -b 190 -b 40 -b 15  \
 outputs/f281013b_utm_wgs84N50_mapped_subsection.bil \
  • -of: Output Format – we are using GeoTiff file. For all available options see
  • -co: Creation Option – specifies the creation options, these are specific to the format. For GeoTiff we use compression to reduce the file size. For ENVI we can use the creation options to specify the Interleave (BSQ or BIL)
  • -projwin: A rectangle delineating an area to be extracted, in the format min X min Y max X max Y
  • -b: A given band to extract
  • final two arguments: input and output filenames

Create mosaic of multiple flightlines

Often a mosaic of all individual flightlines is required. Due to the large sizes of the mapped files this can often be time consuming. One option is to first make a ‘Virtial Raster’ using the ‘gdalbuildvrt’ command:

gdalbuildvrt -srcnodata 0 outputs/f2810_mapped_mosaic.vrt \
 outputs/f281013b_utm_wgs84N50_mapped_3band.bil \
  • -srcnodata: No data value of input lines
  • first argument: output VRT file
  • final arguments: list of input files

Rather than creating a new file a small .vrt text file is created which contains links to the individual lines.

Many programs which use GDAL (e.g., TuiView) can open this file directly. To make opening the file easier we can do some optimisation. First pre-calculate statistics using:

gdalinfo -stats outputs/f2810_mapped_mosaic.vrt
  • -stats: Calculate statistics and save to .aux.xml file
  • final argument: file name

The gdalinfo command can also be used to print information about a raster. Note for the .vrt file it shows the individual files which are included.

After creating statistics generate overviews (lower resolution copies at different resolutions) using gdaladdo:

gdaladdo -r average outputs/f2810_mapped_mosaic.vrt 2 4 8 16
  • -r average: Use average for resampling overviews
  • first argument: file name
  • numbers: overview levels to build (2x, 4x, 8x and 16x)

We can also make a real raster using gdal_translate (as shown previously).

Extract pixel values for a given location

gdallocationinfo -valonly -geoloc \
 outputs/f281013b_utm_wgs84N50_mapped_subsection.bil \
 481801 765289
  • -valonly: Only print values to the screen
  • -geoloc: Interprerate x and y values as coordinates rather than pixel and line
  • first argument: file name
  • x y: Easting and Northing of pixel to extract statistics from

This will print the values to the screen. To save as a text file we can redirect the output using ‘>’

gdallocationinfo -valonly -geoloc \
 outputs/f281013b_utm_wgs84N50_mapped_subsection.bil \
 481801 765289 > outputs/extracted_values.txt

This text file can be opened in other programs for further analysis.

This post is based on the hyperspectral practical session from the NERC Airborne Survey and Research Facility workshop held at Plymouth Marine Laboratory, March 2016. For more information about the ARSF Data Analysis Node (ARSF-DAN) and details about future training opportunities follow @ARSF_DAN on twitter. Hyperspectral data acquired by ARSF are available to download from NEODC.

Add all scripts within a repository to $PATH using envmaster

I have a couple of general scripts repositories for myself and shared with colleagues. These are for scripts which are useful but don’t fit into existing projects or justify having their own repository. An example is rsgis_scripts on bitbucket. The scripts are split into different directories which aren’t available on the main path. To make them available when needed I use EnvMaster (described in a previous post). I have an envmaster module which will search the repository for folders containing executables and add them to $PATH. It will also add folders containing ‘’ to $PYTHONPATH.

import os
import glob
REPO_PATH = "/home/dan/Documents/Development/rsgis_scripts"
# Walk through directory
for d_name, sd_name, f_list in os.walk(REPO_PATH):
    # Ignore hidden directories
    if not d_name.startswith(".") and not ".git" in d_name and not ".hg" in d_name:
        file_list = glob.glob(os.path.join(d_name,"*"))
        # Check if a directory contains executable files
        if True in [(os.path.isfile(f) and os.access(f, os.X_OK)) for f in file_list]:
        # Check for Python libraries
        if len(glob.glob(os.path.join(d_name,"*",""))) > 0:

The script is saved as ‘rsgis_scripts’ in ‘$ENVMASTERPATH’.
To load the module and prepend all the folders to ‘$PATH’ use:

envmaster load rsgis_scripts

They will remain on $PATH until you close the terminal or unload the module using:

envmaster unload rsgis_scripts

This script is just an example, it would be easy to modify for different use cases. If you have multiple repositories a module file can be created for each one.

Attribute LAS points using hyperspectral data

A good way of visualising LiDAR point clouds is to attribute them with colour information.
LAS files of point type 2 and above have the ability to store RGB colour information. These can be viewed with programs such as and CloudCompare.

To add RGB colour information to a LAS file ARSF-DAN have made a Python script available from:

The script uses a combination of laspy (to read and write a LAS file) and GDAL (to read the raster) and is capable of working with images with more than 3 bands such as multispectral or hyperspectral files without creating a three band copy of the image first.

For this post I’ll demonstrate running the script on the JASMIN/CMEMS system. In a nutshell it is a Linux computing facility with lots of fast storage and direct access to many datasets, including the ARSF archive of airborne LiDAR and Hyperspectral data (apply for access to ARSF data here). Researchers in the UK who are part of the NERC or National Centre for Earth Observation (NCEO) scientific communities can apply for access. If you don’t have access to JASMIN you can run on your own machine just ignore the JASMIN specific steps.

  1. Apply for access and logon to JASMIN
    If you are using JASMIN the first stage is to get an account and log on by following the steps in this guide:
  2. Set up scripts
    Once you have logged onto one of the shared VMs (e.g., you need to load laspy, which isn’t installed by default. You can do this using:

    module load contrib/arsf/laspy/1.3.0

    Then checkout the repository containing the ‘’ script using:

    git clone

    Check the script runs OK by typing:

    python2.7 arsf_tools/ -h

    Note, you need to specify Python 2.7 as the default system Python is 2.6 but most of the Python libraries (e.g., GDAL) are build against 2.7. If you are not running the scripts on JASMIN just using ‘python’ should be fine.

  3. Colour using ARSF hyperspectral dataFor this example data from Mont Blanc flown as part of a EUFAR campaign in 2015 on Julian day 175a (24/06/2015) will be used. This data can be downloaded from NEODC (direct link).
    python2.7 arsf_tools/ \
              --image /vsizip/$HS_PATH/ \
              --red 38 \
              --green 25 \
              --blue 12 \
              --scale \
             $LIDAR_PATH/LDR-EUFAR15_28-2015-175a-18.LAS \

    This will attribute using bands 38, 25 and 12 (true colour) and will scale the pixel values from 16 bit to 8 bit (0 – 255) using a standard deviation stretch.To read the hyperspectral data without unzipping a GDAL virtual file system is used, as described in a previous post.

  4. LASzip files and Download
    To reduce the size of the files before downloading you can use LASzip. On JASMIN LAStools can be loaded using:

    module load contrib/arsf/lastools/20150925

    To compress the LAS file and drop points flagged as noise the following command can be used:

    las2las -drop_class 7 -i LDR-EUFAR15_28-2015-175a-18_col.LAS -o LDR-EUFAR15_28-2015-175a-18_col.laz

    A similar command can be used to decompress files.

    See the following guide on transferring data off JASMIN:

  5. View using
  6. Open in a modern browser and click ‘Browse’ to load in the coloured LAS file.
    The nice thing about using and JASMIN is no specialist software needs to be installed on your local machine, just an ssh client and a web browser.

  7. View using Cloud Compare Viewer
  8. If you want to view files locally you can use Cloud Compare, which is available for Windows, OS X and Linux. To open the coloured LAS/LAZ file in the cloud compare viewer just drag it into the viewer window.


  9. Additional – Colour using an existing three band image
    If you already have a three band image, with pixel values 0 – 255 a simpler command can be used: --image in_raster_file_rgb.tif \
                  in_las_file.las out_las_file.las

Convert EASE-2 grid cell to latitude and longitude using Python

The EASE-2 grid is used for a number of NASA datasets including SMAP. It is described in the following paper:

Brodzik, M. J., Billingsley, B., Haran, T., Raup, B., & Savoie, M. H. (2012). EASE-Grid 2.0: Incremental but Significant Improvements for Earth-Gridded Data Sets. ISPRS International Journal of Geo-Information, 1(3), 32–45.

Files with the centre coordinate of each cell for EASE-2 grids at different resolutions are available from as well as tools for conversion.

To read these files into Python the following steps can be used:

  1. Download the relevant gridloc file.

    The FTP link for the grid location files is:

    For this example I’ve chosen the 36km cylindrical EASE-2 grid (gridloc.EASE2_M36km.tgz)

  2. Un-tar using:
    tar -xvf gridloc.EASE2_M36km.tgz
  3. Read the files into Python:
    import numpy
    # Read binary files and reshape to correct size
    # The number of rows and columns are in the file name
    lats = numpy.fromfile('EASE2_M36km.lats.964x406x1.double', 
    lons = numpy.fromfile('EASE2_M36km.lons.964x406x1.double', 
    # Extract latitude and longitude
    # for a given row and column 
    grid_row = 46
    grid_column = 470
    lat_val = lats[grid_row, grid_column]
    lon_val = lons[grid_row, grid_column]