Glastree Nearline Backups
Several years ago a friend told me about pdumpfs, a small Python script modeled after the backup system used by Plan 9. Not being a Python guy, I found a Perl script with similar behavior called glastree. I use it along with rsync and a 1TB Iomega Network StorCenter disk to do nearline backups of the systems on my home network.
I purchased the Iomega StorCenter disk and NFS mounted it on my network server. It turned out that under the covers this device runs Samba on top of a stripped-down Linux kernel. It also claims to fully support NFS, but the support is incomplete because the disk does not support separate uids or gids. However, this limitation did not pose an insurmountable barrier to using the device as a nearline backup solution. I mounted the disk on my server at the /backup directory. For each system I wish to back up, I created two directories:
- <name>
- <name>-backups
The <name> directory holds a current rsynced copy of the contents of selected directories on the target computer. We will see how this is populated shortly. The <name>-backups directory is a glastree copy of the <name> directory. This directory tree is organized by date: the top-level directories are named yyyymm, and each subdirectory under them is a day of the month. Once a day (early in the AM) a cron job kicks off on the backup server and runs the following script:
#! /bin/sh
BACKUPDIR=/backup
HOSTS="ccm-notebook ccm-desktop smoot-notebook"
for host in $HOSTS; do
    if cd "$BACKUPDIR/$host"; then
        glastree . "$BACKUPDIR/${host}-backups"
    else
        echo "Cannot cd to $BACKUPDIR/$host"
    fi
done
glastree takes two arguments: the source directory to copy from and the target directory tree. The script cds into the source tree and then performs the copy. If you do not cd to the source tree, glastree will add unnecessary directories in the target. Like pdumpfs, glastree only copies files which have changed. All unchanged files are simply hard-linked to the copy in the previous backup. For filesystems which change slowly, this saves a lot of disk space.
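To see why unchanged files cost almost nothing, here is a hand-rolled sketch of the hard-link idea. It is not glastree itself: the scratch directory, the file names, and the use of cmp to detect changes are all assumptions made for illustration.

```shell
#!/bin/sh
# Illustration only: mimics the hard-link snapshot idea by hand.
demo=$(mktemp -d)                       # scratch area (hypothetical layout)
mkdir -p "$demo/src" "$demo/201001/01" "$demo/201001/02"

echo "unchanged" > "$demo/src/a.txt"
echo "version 1" > "$demo/src/b.txt"

# Day 1: first snapshot, so every file is copied.
cp -p "$demo/src/a.txt" "$demo/src/b.txt" "$demo/201001/01/"

# b.txt changes before the next run.
echo "version 2" > "$demo/src/b.txt"

# Day 2: hard-link files that match yesterday's copy, copy the rest.
for f in a.txt b.txt; do
    if cmp -s "$demo/src/$f" "$demo/201001/01/$f"; then
        ln "$demo/201001/01/$f" "$demo/201001/02/$f"
    else
        cp -p "$demo/src/$f" "$demo/201001/02/$f"
    fi
done

# Matching inode numbers mean both names share one copy on disk.
ls -i "$demo/201001/01/a.txt" "$demo/201001/02/a.txt"
ls -i "$demo/201001/01/b.txt" "$demo/201001/02/b.txt"
```

The two a.txt entries should report the same inode number, while the two b.txt entries should differ because the changed file was stored again.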
Populating the source tree is done with rsync. On ccm-desktop the following rsync script is run daily, sometime before the preceding backup script is run on the server:
#! /bin/sh
PATH=/usr/local/bin:$PATH
export PATH
RSYNC_RSH=/usr/bin/ssh
rsync -e $RSYNC_RSH --delete --exclude Shared --exclude "*.dmg" -S -rlptD --progress /Users casa-pc.sanpedro.tic.com:/backup/ccm-desktop
The backup policies are encapsulated in the script. In this case ccm-desktop is an Apple Macintosh. I am only backing up the Users directory, excluding any .dmg files (big disk images) and anything in the Shared folder. Since this job runs out of cron, it requires the root account to have public/private key authentication set up. Notice that the rsync -o and -g options are missing from the option list (-rlptD is the usual -a archive mode minus those two). This is because the Iomega device does not support uids and gids, and omitting those options avoids a lot of warning messages in the cron output.
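For readers who have not set up key authentication for an unattended job before, one possible approach is sketched here. The key file name id_backup is a hypothetical choice (the setup above does not say which key it uses), and an empty passphrase is assumed so that cron can run without prompting.

```shell
#!/bin/sh
# Sketch: create a passphrase-less key pair for unattended cron use.
# KEYFILE is a hypothetical location, not taken from the setup described above.
KEYFILE=${KEYFILE:-$HOME/.ssh/id_backup}
mkdir -p "$(dirname "$KEYFILE")"
chmod 700 "$(dirname "$KEYFILE")"

# Generate the pair only once; -N "" sets an empty passphrase.
[ -f "$KEYFILE" ] || ssh-keygen -q -t rsa -b 4096 -N "" -f "$KEYFILE"

# One-time manual step (prompts for the server password):
#   ssh-copy-id -i "$KEYFILE.pub" root@casa-pc.sanpedro.tic.com
# The rsync job can then name the key explicitly:
#   rsync -e "/usr/bin/ssh -i $KEYFILE" ...
```

A key without a passphrase is only as safe as the file permissions protecting it, so it is worth keeping it readable by root alone.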
This simple system works very well for nearline backups and has saved me on many occasions from disaster when I have inadvertently deleted a file or needed to find an earlier copy of a file.
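Finding an earlier copy comes down to building the path of the right day's snapshot. The helper below is a rough sketch of that lookup. snapshot_path is a hypothetical function, the date is made up, and two-digit day directories are an assumption about how glastree names them.

```shell
#!/bin/sh
# Sketch: build the path of one day's snapshot under the layout
# <root>/<host>-backups/yyyymm/dd. snapshot_path is a hypothetical helper.
snapshot_path() {
    root=$1 host=$2 day=$3              # day is given as YYYY-MM-DD
    yyyymm=$(printf '%s' "$day" | cut -c1-4,6-7)
    dd=$(printf '%s' "$day" | cut -c9-10)
    printf '%s/%s-backups/%s/%s\n' "$root" "$host" "$yyyymm" "$dd"
}

snapshot_path /backup ccm-desktop 2010-03-07
# prints /backup/ccm-desktop-backups/201003/07
```

Restoring is then an ordinary copy out of that directory, for example with cp -p, using whatever path the file had under the source tree.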
glastree can be found here. Included in the distribution is a prune script which can be used to prune the tree periodically.
