Backup script for Drupal using Drush and Cron

This is something I have been meaning to write up for a while: how to automate backups using cron and Drush, a command-line tool for Drupal. Drush makes creating backups of your Drupal website's database and files really easy. I have written a script that calls Drush to create a backup, and then manages your existing backups so you don't use up too much space on your drive. Once a month, it will also create an encrypted copy of that day's backup file and email it to an external email address.

Personally, I think that daily backups from a month ago aren't useful to me any more - if I was going to revert to a backup from that long ago, I'd be looking to restore a weekly backup. Similarly, after several months, I'd only be interested in monthly backups. After creating your daily backup, the script runs through the other files in the backup directory and deletes the files you don't need any more. The script will keep:

  • One week of daily backups
  • One month of weekly backups (1st, 8th, 15th and 22nd)
  • Monthly backups for one year
  • Yearly backups for ever
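The pruning rules above can be sketched in a few lines of bash. The dates below are hard-coded (and arbitrary) so the example is reproducible; the real script further down compares each file against today's date:

```shell
#!/bin/bash
# Sketch of the retention rules using example dates.
export TZ=UTC                      # avoid DST skew in the day arithmetic
FILENAME="2015-02-08.tar.gz"       # example backup file name
TODAY="2015-02-20"                 # example "current" date

STAMP=${FILENAME%%.*}              # strip .tar.gz -> 2015-02-08
AGE=$(( ($(date -d "$TODAY" +%s) - $(date -d "$STAMP" +%s)) / 86400 ))

if [[ $STAMP == *-01-01 ]]; then
  DECISION="keep (yearly)"
elif [[ $STAMP == *-01 ]] && (( AGE < 365 )); then
  DECISION="keep (monthly)"
elif [[ $STAMP == *-08 || $STAMP == *-15 || $STAMP == *-22 ]] && (( AGE < 30 )); then
  DECISION="keep (weekly)"
elif (( AGE < 7 )); then
  DECISION="keep (daily)"
else
  DECISION="delete"
fi

echo "$STAMP: age $AGE days, $DECISION"
```

With these example dates the file is twelve days old and falls on the 8th, so it is kept as a weekly backup.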

Before I wrote the script, samhobbs.co.uk was a WordPress site hosted on a Raspberry Pi, until the drive I was using bricked. I didn't have a recent backup, so I lost the lot. Now I make backups to an external hard drive, so that if the SSD in my Intel NUC gets corrupted I'll be able to recover. The encrypted copies sent to an external email address protect against the server being stolen or lost in a fire.

Installing Drush

If you haven't installed Drush, here's how. If you have, you can skip to the next section.

Drush is in the repos, so installation is really simple:

sudo apt-get update
sudo apt-get install drush

Certain PHP functions are necessary for Drush to run, and I found that my default CLI php.ini was disabling these. To get things working again, open /etc/php5/cli/php.ini and comment out this line, so it looks like this:

#disable_functions = pcntl_alarm,pcntl_fork,pcntl_waitpid,pcntl_wait,pcntl_wifexited,pcntl_wifstopped,pcntl_wifsignaled,pcntl_wexitstatus,pcntl_wtermsig,pcntl_wstopsig,pcntl_signal,pcntl_signal_dispatch,pcntl_get_last_error,pcntl_strerror,pcntl_sigprocmask,pcntl_sigwaitinfo,pcntl_sigtimedwait,pcntl_exec,pcntl_getpriority,pcntl_setpriority,
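If you prefer not to edit the file by hand, a sed one-liner can comment the line out. This sketch operates on a scratch copy so you can see the effect; on a real system you would point it at /etc/php5/cli/php.ini (and back the file up first):

```shell
# Work on a scratch copy of php.ini for demonstration purposes.
INI=$(mktemp)
echo 'disable_functions = pcntl_fork,pcntl_exec' > "$INI"

# Prefix the disable_functions line with '#' to comment it out.
sed -i 's/^disable_functions/#disable_functions/' "$INI"

grep '#disable_functions' "$INI"   # show the now-commented line
```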

Now check if Drush is working:

sudo drush status

You should get some output like this:

PHP configuration     :  /etc/php5/cli/php.ini 
Drush version         :  5.10.0                
Drush configuration   :

If so, all is well.

The Script

So, here's the script. It should be copied to /etc/cron.daily/website-backup, then edited so that BACKUP_DIR is the directory on your external hard drive where you'd like the backups to go, and DRUPAL_DIR is the path to Drupal's root. You also need to create a file in your home directory containing an encryption passphrase to pass to mcrypt, and set ENCRYPTION_KEYWORDFILE to the full path of that file (don't use ~). It's wise to restrict the permissions on this file: chmod 600 /path/to/file. The file should be a plain text file with a single line containing your passphrase, which can be up to 512 characters long.
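For example, you could create the passphrase file like this (the path and passphrase are placeholders; substitute your own):

```shell
# Example only: use your own path and passphrase.
KEYFILE="$HOME/.mcryptpasswordfile"   # must match ENCRYPTION_KEYWORDFILE
umask 077                             # new files readable by owner only
echo "replace-this-with-a-long-passphrase" > "$KEYFILE"
chmod 600 "$KEYFILE"                  # belt and braces
```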

You also need to fill in EXTERNAL_EMAIL so that the script knows where to send your encrypted backup.
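One gotcha with /etc/cron.daily: the scripts there are executed by run-parts, which skips filenames containing a dot, so name the script website-backup rather than website-backup.sh, and make sure it is executable. A quick sketch, using a scratch directory in place of /etc/cron.daily:

```shell
# Scratch directory standing in for /etc/cron.daily
DIR=$(mktemp -d)

# Note: no file extension - run-parts ignores names containing a dot.
printf '#!/bin/sh\necho ok\n' > "$DIR/website-backup"
chmod 755 "$DIR/website-backup"   # must be executable

"$DIR/website-backup"   # cron (via run-parts) runs it like this
```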

The emailing part of the script relies on you having a Mail Transfer Agent (MTA) installed on the server, e.g. Postfix. If this is not the case, take a look at my email server tutorial.

Most of the utilities in the script should already be installed, but you will probably have to install mutt (for sending the email) and mcrypt (for encrypting your file):

sudo apt-get update
sudo apt-get install mutt mcrypt

With those installed, here's the script itself:
#! /bin/bash
# Backup script for www.samhobbs.co.uk

# USE FULL PATH
BACKUP_DIR="/media/backup/website/"
DRUPAL_DIR="/var/www/samhobbs/"
ENCRYPTION_KEYWORDFILE="/home/sam/.mcryptpasswordfile"

# External email address to send monthly encrypted backup files to
EXTERNAL_EMAIL="you@yourexternalemail.com"

# redirect output and errors to log file
# (the file redirection must come first so that 2>&1 points stderr at the log)
exec 1>>"${BACKUP_DIR}backup-log.txt" 2>&1

NOW=$(date +"%Y-%m-%d")


# Headers for log
echo ""
echo "#==================================================== $NOW ====================================================#"
echo ""

# Back up Drupal with Drush
drush archive-dump default -r "$DRUPAL_DIR" --tar-options="-z" --destination="$BACKUP_DIR$NOW.tar.gz"

# clean up old backup files
# we want to keep:
#               one week of daily backups
#               one month of weekly backups (1st, 8th, 15th and 22nd)
#               monthly backups for one year
#               yearly backups thereafter

# seconds since epoch (used for calculating file age)
SSE=$(date +%s)

FILES_LIST=( "$BACKUP_DIR"* )

for file in "${FILES_LIST[@]}"; do
  if [[ $file = *20[0-9][0-9]-[0-9][0-9]-[0-9][0-9].tar.gz ]]; then
    FILENAME=$(basename "$file")
    FILENAME_NO_EXTENSION=${FILENAME%%.*}
    FILE_YEAR=$(echo $FILENAME_NO_EXTENSION | cut -d'-' -f 1)
    FILE_MONTH=$(echo $FILENAME_NO_EXTENSION | cut -d'-' -f 2)
    FILE_DAY=$(echo $FILENAME_NO_EXTENSION | cut -d'-' -f 3)
    SSE_FILE=$(date -d "$FILE_YEAR$FILE_MONTH$FILE_DAY" +%s)
    AGE=$((($SSE - $SSE_FILE)/(24*60*60))) # age in days
    
    # if file is from the first day of a year (yearly backup), skip it
    if [[ $file = *20[0-9][0-9]-01-01.tar.gz ]]; then
      echo "file $file is a yearly backup: keeping"

    # if file is from the first day of a month (monthly backup) and age is less than 365 days, skip it
    elif [[ $file = *20[0-9][0-9]-[0-9][0-9]-01.tar.gz ]] && [ $AGE -lt 365 ]; then
      echo "file $file is a monthly backup, age < 1yr: keeping"

    # if day of month is 08, 15 or 22 (weekly backup) and age is less than 30 days, skip it
    elif [[ $FILE_DAY = 08 || $FILE_DAY = 15 || $FILE_DAY = 22 ]] && [ $AGE -lt 30 ]; then
      echo "file $file is a weekly backup, age < 30 days: keeping"

    # if age is less than seven days, skip it
    elif [ $AGE -lt 7 ]; then
      echo "file $file is a daily backup, age < 7 days: keeping"
    
    # if it hasn't matched one of the above, it should be deleted
    else
      echo "removing file $file"
      rm "$file"
    fi
  else
    echo "file $file does not match the expected pattern: skipping"
  fi
done

DAY=$(date +%d)

if [[ $DAY = 01 ]]; then
  echo "encrypting a copy of today's backup to send by email"
  # encrypt today's backup file using mcrypt
  mcrypt -F -f "$ENCRYPTION_KEYWORDFILE" "$BACKUP_DIR$NOW.tar.gz"
  
  # if the encryption is successful, email the file to an external email address
  if [[ -f "$BACKUP_DIR$NOW.tar.gz.nc" ]]; then
    echo "Monthly backup created $NOW, encrypted using mcrypt" | mutt -s "Monthly backup" -a "$BACKUP_DIR$NOW.tar.gz.nc" -- "$EXTERNAL_EMAIL"
    echo "Email sent, removing encrypted file"
    rm "$BACKUP_DIR$NOW.tar.gz.nc"
    echo "Done"
  else
    echo "Something went wrong with mcrypt: the encrypted file was not found"
    exit 1
  fi
fi

The exec redirection near the top of the script sends standard output and any errors to a log file in the backup directory. If you would rather have cron email this output to your root user (its default behaviour), comment out that line.
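The order of the two redirections matters: 2>&1 duplicates stderr to wherever stdout currently points, so it must come after the file redirection if errors are to land in the log too. A self-contained sketch using a temp file in place of the real log:

```shell
LOG=$(mktemp)

# Run in a subshell so the redirection doesn't affect this shell.
(
  exec >> "$LOG" 2>&1   # stdout to the log first, then stderr follows it
  echo "normal output"
  echo "an error" >&2
)

cat "$LOG"   # both lines end up in the log
```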

Restoring a backup

Drush also makes restoring backups really easy. Here's how to restore a backup:

cd /var/www
sudo drush archive-restore --db-su=root --db-su-pw=YOURDATABASEROOTPW /path/to/backup.tar.gz
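Before restoring, it's worth checking that the archive itself is intact: tar -t will fail on a truncated or corrupt file. A self-contained sketch (on a real system the path would be your backup file):

```shell
# Build a tiny example archive in a temp dir, then verify it lists cleanly.
TMP=$(mktemp -d)
echo "content" > "$TMP/file.txt"
tar -czf "$TMP/backup.tar.gz" -C "$TMP" file.txt

# A corrupt or truncated archive makes tar exit non-zero here.
if tar -tzf "$TMP/backup.tar.gz" > /dev/null; then
  echo "archive OK"
fi
```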

To do

  • This script works well for protecting against drive failure, but if someone literally walks into your home and takes your server you're still stuffed on local backups alone. That's why the script also emails an encrypted backup to an email address of your choice once a month, which adds an extra layer of protection.
  • It's important to test your backups, but testing one on your actual server isn't ideal, because it overwrites your current installation. I'm planning on writing a quick tutorial for setting up a local-only version of Apache on a laptop for testing and development.

Comments

Thanks for the great script!

I set it up on a server about a month ago and it worked great for a month, but then it started to produce files that didn't follow the naming scheme, here's an excerpt from the backup-log.txt:

file /var/backups/museums/2015-03-18.tar.gz is a daily backup, age < 7 days: keeping
file /var/backups/museums/2015-03-19.tar.gz is a daily backup, age < 7 days: keeping
file /var/backups/museums/backup-log.txt does not match the expected pattern: skipping
file /var/backups/museums/drupal_xx.20150320_062942.tar does not match the expected pattern: skipping
file /var/backups/museums/drupal_xx.20150321_062942.tar does not match the expected pattern: skipping
...

Have you seen this before or got any idea why it could be happening? On the 30th it managed to work as expected again, but not any other days since, and many of these days the drush status was not output to the log at all. Very much confused and doubt it is anything wrong with the script itself, could it be a lack of memory on the server or something like that?

That's very strange! I've not seen anything like that before (just checked my log to make sure).

I think you might be on the right track - I'd say something interrupted drush because it created a .tar but didn't gzip it. No idea what that could be though. What kind of hardware are you using...is it a pi or something else with very limited memory?

Sam
