Execute commands on multiple computers using GNU Parallel (setting up a cluster on the cheap)

I’ve mentioned before how awesome GNU Parallel [1] is for easily making use of multiple cores on a single machine. You can also use it to run commands on multiple machines if you have SSH access to them and have set up SSH keys for password-less login (there is a guide to setting up SSH keys here: https://www.digitalocean.com/community/tutorials/how-to-set-up-ssh-keys--2).

For this example, we’ll assume I have set up SSH keys for three computers under my username ‘dan’. I create a text file ‘nodefile’ with the IP of each machine, and set the number of cores to use on each, using the following format:

2/ dan@
4/ dan@
4/ dan@
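
As a minimal sketch, the file can be written straight from the shell. The addresses below are hypothetical placeholders from the RFC 5737 documentation range, not the machines used in this post, and the entries are written without a space after the slash, which is the form I recall the GNU Parallel manual using. The awk line just adds up the core counts to show how many jobs could run at once.

```shell
# Hypothetical addresses (RFC 5737 documentation range); replace with your own machines.
cat > nodefile <<'EOF'
2/dan@192.0.2.11
4/dan@192.0.2.12
4/dan@192.0.2.13
EOF

# Sum the number before each slash to get the total number of job slots.
total_slots=$(awk -F/ '{ sum += $1 } END { print sum }' nodefile)
echo "Total job slots: $total_slots"
```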

You can tell parallel to use this file with the ‘--sshloginfile’ flag. As an example, we can print out the hostname four times:

parallel --sshloginfile nodefile echo "Number {}: Running on \`hostname\`" ::: 1 2 3 4

This will produce an output something like:

Number 1: Running on dan-computer1
Number 4: Running on dan-computer2
Number 3: Running on dan-computer2
Number 2: Running on dan-computer3

Note, the commands won’t necessarily be executed in order.
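
If the order of the output matters, GNU Parallel has a ‘--keep-order’ flag that prints results in the order of the inputs (the jobs themselves may still run in any order). The sketch below runs locally rather than over SSH so it can be tried without any remote machines; the guard around it is my own addition and simply skips the demo if GNU Parallel isn’t installed.

```shell
# Local demo: no --sshloginfile, so no remote machines are needed.
# The guard skips the demo if the 'parallel' on PATH is not GNU Parallel.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # --keep-order prints output in input order, whatever order the jobs finish in.
    ordered=$(parallel --keep-order echo "Number {}" ::: 1 2 3 4)
else
    ordered="GNU Parallel not found, skipping"
fi
printf '%s\n' "$ordered"
```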

For a more useful example, we can use gdal_translate to copy an image, keeping only the first band, for all files matching ‘*tif’ in the current directory of your local machine. Each file needs to be copied to the remote machine (‘--transfer’) and the output returned (‘--return FILE’). The input and output files are removed from the remote machine after the command has completed (‘--cleanup’).

The total command looks something like:

ls *tif | parallel --sshloginfile nodefile \
     --dry-run \
     --transfer \
     --return {.}_b1.tif \
     --cleanup \
     gdal_translate -of GTiff -b 1 {} {.}_b1.tif

To print the commands, but not run them (to check everything looks OK), the ‘--dry-run’ flag is used. The output should be something like:

gdal_translate -of GTiff -b 1 image1.tif image1_b1.tif
gdal_translate -of GTiff -b 1 image2.tif image2_b1.tif
gdal_translate -of GTiff -b 1 image3.tif image3_b1.tif

The syntax ‘{.}_b1.tif’ takes the name of the input file, removes the extension and appends ‘_b1.tif’ on the end.
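
To see what that replacement does without involving parallel at all, here is a rough equivalent using plain shell parameter expansion. This is only an illustration of the naming rule for simple file names, not how GNU Parallel implements it, and no files are read or written.

```shell
# File names only; nothing is read from or written to disk.
for f in image1.tif image2.tif image3.tif; do
    out="${f%.*}_b1.tif"   # strip the last extension, then append the suffix
    echo "$f -> $out"
done
```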

Running the command again, without the dry-run flag, will run the commands. The output from GDAL won’t be printed until the command has finished. Once all the commands are complete there will be a ‘*_b1.tif’ copy of every tif in the input directory.

There is an overhead to copying files to other machines, so this is only worthwhile if the commands you want to run are computationally intensive. There isn’t much benefit to using it for ‘gdal_translate’, but it makes a nice simple example to demonstrate the capability.

Further reading

[1] O. Tange (2011): GNU Parallel – The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47.
