Multi-core processing with RSGISLib using tiles

RSGISLib's imageutils module includes functions for tiling and mosaicking images. These can be combined to split large datasets up for processing on multiple cores, or on multiple nodes of an HPC system. The Python binding for the createTiles function was written to return a list of all the tiles created, so the list can be passed straight to a separate function for processing. Combined with Python's multiprocessing module, this provides a simple way of processing large datasets on multiple cores, with the createImageMosaic function called at the end to join the tiles back together.

The example below shows how to:

  1. Split an image into temporary tiles (in a directory created using the tempfile module).
  2. Run an RSGISLib function on each tile, in this case imageMath.
  3. Re-mosaic the tiles.
  4. Remove the temp files created.
# Import RSGISLib
import rsgislib
from rsgislib import imageutils
from rsgislib import imagecalc

# Import multiprocessing
from multiprocessing import Pool
# Import tempfile
import tempfile
# Import os
import os

outFormat = 'KEA'
outType = rsgislib.TYPE_32INT

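def addOne(inImage):
    """Add 1 to band 1 of a tile and return the output file name."""

    # Build the output name from the tile name (tile.kea -> tileadd1.kea);
    # splitext only touches the extension, not the rest of the path
    base, extension = os.path.splitext(inImage)
    outputImage = base + 'add1' + extension
    expression = 'b1+1'
    imagecalc.imageMath(inImage, outputImage,
                    expression, outFormat, outType)

    return outputImage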

if __name__ == '__main__':

    inputImage = 'N06W053_PALSAR_08_HH_utm.kea'
    outImage = 'N06W053_PALSAR_08_HH_utm_addOne.kea'

    # Create temporary directory
    tempDIR = tempfile.mkdtemp(dir='.')
    outBase = os.path.join(tempDIR, 'tile')

    # Create Tiles
    width = 1000
    height = width
    overlap = 5 # Set a 5 pixel overlap between tiles
    offsettiling = 0
    ext='kea'
    temptiles = imageutils.createTiles(inputImage, outBase, width,
             height, overlap, offsettiling, outFormat, outType, ext)

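    # Run process on tiles (by default Pool starts one worker per core).
    # The with block shuts the pool down once map() has returned.
    with Pool() as pool:
        temptilesP = pool.map(addOne, temptiles)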

    # Mosaic tiles
    backgroundVal = 0.
    skipVal = 0.
    skipBand = 1
    overlapBehaviour = 0

    imageutils.createImageMosaic(temptilesP, outImage, backgroundVal,
               skipVal, skipBand, overlapBehaviour, outFormat, outType)

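    # Remove temp tiles and DIR
    removeList = temptiles + temptilesP
    for tile in removeList:
        os.remove(tile)

    # rmdir removes only tempDIR; removedirs would also try to
    # remove any empty parent directories in the path
    os.rmdir(tempDIR)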

As all the worker processes try to write to stdout at the same time, the output looks a little messy (I still need to figure out a nice way to improve this). Also, although I didn’t run any tests, the overhead of creating and mosaicking the tiles will likely make this image maths function (which is not particularly CPU intensive) take longer than running it on the entire image in one go. However, it shows the potential for combining the RSGISLib Python bindings with the multiprocessing module.
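One possible way to tidy up the interleaved output is to send each worker's stdout to its own log file while it processes a tile. The sketch below is not part of RSGISLib and has not been tested with it. stdout_to_file is a hypothetical helper name, and it assumes the library writes its messages to the process's standard output (file descriptor 1). The redirect is done at the file-descriptor level because replacing sys.stdout in Python would not catch output written by compiled code.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def stdout_to_file(log_path):
    """Hypothetical helper: send everything written to file
    descriptor 1 (stdout) to log_path while the block runs."""
    sys.stdout.flush()        # empty Python's own buffer first
    saved_fd = os.dup(1)      # keep a handle on the real stdout
    log_fd = os.open(log_path,
                     os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.dup2(log_fd, 1)    # fd 1 now points at the log file
        yield
    finally:
        sys.stdout.flush()    # push buffered Python output to the log
        os.dup2(saved_fd, 1)  # put the real stdout back
        os.close(log_fd)
        os.close(saved_fd)
```

Inside addOne, the imageMath call could then be wrapped in a block such as with stdout_to_file(outputImage + '.log'):, which would leave one log per tile. Logs written into the temporary directory would need removing before the directory itself, and output still held in the library's own buffers when the block ends may not reach the log.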
