Everyone would like all their work securely backed up, with every version of every file instantly accessible, but few backup solutions come anywhere close to this ideal. And making backups is just boring.
Many network backup systems resemble a 1970s mainframe more than a 21st-century PC: if you want something restored you have to grovel to the operators to do it because they back up to tape, and most users don’t have their own tape drives.
This leads many of us to back up the contents of our hard disks to a disk on another computer, and that machine could be anywhere, perhaps in another country, connected over the internet. Sending valuable data across the world also means we have to think about securing it using encryption.
Most backup solutions are tailored either to a “disaster recovery” scenario, in which your hardware fails and you need to re-create its whole configuration on a replacement machine; or to more local disasters such as accidentally deleting important project files.
Sometimes, though, you may take a backup for archival purposes, to record the whole state of your work environment on a particular day for posterity (or the VAT man). There are open-source products to cater for all three of these different roles, but it’s important to keep them distinct.
Taking a snapshot
“Snapshots” are one of the least understood and least used facilities offered by modern file systems. A snapshot is a consistent, point-in-time copy of an active file system that can be accessed separately from that file system. Some systems provide read-only snapshots that can optionally be made writable, while others provide snapshots that are writable from the start.
My personal favourite is ZFS on Solaris, which takes a snapshot almost instantaneously and consumes very little extra disk space.
Any snapshot scheme imposes some space overhead, but while ZFS’s isn’t too bad, anecdotal evidence suggests that file system performance on Linux’s Logical Volume Manager (LVM) can degrade badly when there’s a snapshot present, and there are similar issues with Windows’ Volume Shadow Copy Service (VSS).
This is no reason to avoid snapshots, but something to be aware of. The Linux world has the btrfs file system to look forward to, which will provide similar snapshot performance to ZFS, as well as many other features. Btrfs is now part of the main Linux kernel, and we’ll try it soon and report our results.
We currently take snapshots of all our systems at 9am every morning and 7pm every evening, and on our ZFS-based systems we keep such snapshots for at least a month, meaning we store at least 60 complete snapshots at any time.
We’ve spent quite some time ensuring that the database management systems that run on those machines aren’t overly sensitive to the degraded disk performance, but we do routinely prune some snapshots.
We employ separate file systems for database tables and for the binary logs that are used for recovery and replication: for the binary logs we keep up to 30 days of snapshots, but for the database tables we keep around four months’ worth.
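To make the retention arithmetic concrete, here is a minimal Python sketch of how such a pruning rule might be expressed. It is not the script we run: the file system labels, the 120-day figure standing in for “around four months”, the dates, and the function names are all assumptions made for illustration, and the code only decides which snapshot timestamps are stale rather than calling any real snapshot tool.

```python
from datetime import datetime, timedelta

# Hypothetical retention periods per file system, loosely following the text:
# up to 30 days for binary logs, roughly four months (assumed 120 days) for tables.
RETENTION_DAYS = {
    "binlogs": 30,
    "tables": 120,
}


def twice_daily(start, days):
    """Generate 9am and 7pm snapshot times for `days` consecutive days."""
    times = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        times.append(day.replace(hour=9, minute=0, second=0, microsecond=0))
        times.append(day.replace(hour=19, minute=0, second=0, microsecond=0))
    return times


def snapshots_to_prune(snapshots, retention_days, now):
    """Return the snapshot times older than the retention window, oldest first."""
    cutoff = now - timedelta(days=retention_days)
    return sorted(ts for ts in snapshots if ts < cutoff)


if __name__ == "__main__":
    # Illustrative dates only: five months of twice-daily snapshots.
    now = datetime(2024, 6, 1, 20, 0)
    history = twice_daily(datetime(2024, 1, 1), 153)
    for name, days in RETENTION_DAYS.items():
        stale = snapshots_to_prune(history, days, now)
        print(name, "keep", len(history) - len(stale), "prune", len(stale))
```

Keeping the decision separate from the command that actually destroys a snapshot means the rule can be checked against a list of timestamps before anything is deleted.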