Backup with rsync

There are a lot of great tools out there for backup. However, not all of them are appropriate for large remote sensing datasets. I use the command line tool rsync to back up my data to external hard drives. It only copies data that has changed, making it efficient to run.

The command I use is:


rsync -r -u -p -t --delete --force --progress \
  /data/Australia/ /media/Backup1/Australia/

This will recursively copy the directory (-r), copying only files that have changed and skipping any that are newer on the backup drive (-u), while preserving permissions (-p) and time stamps (-t). Files on the backup drive that no longer exist in the source are removed (--delete), including directories that are not empty (--force). As it can take a while to run, I print the progress to the screen (--progress).
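Because --delete removes files from the backup drive, it can be worth previewing a run before doing it for real. The following is a minimal sketch, assuming the same flags and paths as the command above: adding -n (--dry-run) and -v makes rsync list what it would copy or delete without changing anything. The function name preview_backup is a hypothetical wrapper for illustration, not part of rsync.

```shell
# preview_backup is a hypothetical helper name; the rsync flags are standard.
# Usage: preview_backup SOURCE_DIR/ DEST_DIR/
preview_backup() {
    # -n (--dry-run): report what would happen, but make no changes.
    # -v: list each file that would be transferred or deleted.
    rsync -r -u -p -t --delete --force -n -v "$1" "$2"
}

# Example, using the paths from the command above:
# preview_backup /data/Australia/ /media/Backup1/Australia/
```

If the list looks right, run the original command without -n to perform the backup.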

My system is to back up the data on my office computer regularly (weekly, or after getting or processing new data) to external hard drives, which I keep at home. To save having to remember the command, I have a shell script (backup.sh) in the root directory of the external hard drives. This system works well for me because my lab has a NAS drive on which all the data is stored, and all my scripts are stored separately from the data and backed up a lot more regularly.
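The contents of backup.sh are not shown here, so the following is only a sketch of what the core of such a script might look like. The function name run_backup and the check that the destination exists are assumptions added for illustration; the rsync flags and paths are those from the command above.

```shell
#!/bin/sh
# Hypothetical sketch of the core of backup.sh; names are illustrative only.
# Usage: run_backup SOURCE_DIR/ DEST_DIR/
run_backup() {
    # Stop early if the backup drive is not mounted, so rsync never
    # writes to an empty mount point by mistake.
    if [ ! -d "$2" ]; then
        echo "Backup destination $2 not found; is the drive mounted?" >&2
        return 1
    fi
    rsync -r -u -p -t --delete --force --progress "$1" "$2"
}

# At the end of the script, call it with the real paths, for example:
# run_backup /data/Australia/ /media/Backup1/Australia/
```

Keeping the paths as arguments means the same function can back up several datasets by calling it once per directory.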

If you leave your external hard drive connected to your computer, you can create a cron job to run your backup script at regular intervals. To create a cron job, use:


crontab -e

This will open a text file for editing; the comments (lines beginning with #) explain the format. To back up at 4 pm every day, add the following line:


0 16 * * * sh /media/Backup1/backup.sh

Remember to change the path to the backup script. You can change the hour (the second number) as needed, and you can duplicate the line with a different hour to run the backup twice a day (e.g., in the morning and afternoon).
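As an illustration of running twice a day, the crontab entries might look like the fragment below. The 8 am time is an arbitrary choice for the example, and the script path is the one used above.

```
# minute hour day-of-month month day-of-week command
0 8  * * * sh /media/Backup1/backup.sh
0 16 * * * sh /media/Backup1/backup.sh
```

The two lines can also be combined into one by listing both hours in the second field (0 8,16 * * *).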

There are many options for rsync so you can customise the backup to suit your requirements. To see these options type:


rsync --help

As rsync only copies files that have changed and preserves the time stamp, I also use it when I have a folder with lots of large files to copy. Then, if the copy is interrupted, it can be resumed by running the same command again.
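A minimal sketch of such a resumable copy is given below. The --partial flag is my suggestion rather than something used in the commands above: it tells rsync to keep a partially transferred file if the copy is interrupted, so that a rerun can reuse it instead of starting that file again. The function name copy_resumable is hypothetical.

```shell
# copy_resumable is a hypothetical helper name; the rsync flags are standard.
# Usage: copy_resumable SOURCE_DIR/ DEST_DIR/
copy_resumable() {
    # -r: recurse into directories; -t: preserve modification times,
    #     so unchanged files are skipped on the next run.
    # --partial: keep partially transferred files after an interruption.
    # --progress: show per-file transfer progress.
    rsync -r -t --partial --progress "$1" "$2"
}

# Example: rerun the same line after an interruption to carry on.
# copy_resumable /data/Australia/ /media/Backup1/Australia/
```

Files that were fully copied before the interruption are skipped on the rerun because their size and time stamp already match.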