Getting into tape backups (Part 2)
In Part 1, we had a quick introduction to tapes and tape drives and why you would choose one for your backups. In this part, we talk about actually using tapes to create a backup strategy using simple scripting.
Of course, there are plenty of readily available tools that might suit your needs. Writing your own does have a few advantages, though: first, it keeps things simple and highly customized; second, when things go wrong, you will probably have a better idea of why the damn thing isn't working!
A look at tape's workings
If this is your first time dealing with tapes (as it was for me), there are a few prerequisites.
Tools
Tape operations are carried out via the mt tool. Data is written with tar.
Both of these are probably already installed on your system.
Writing data to tapes
Do you remember the good old days of cassette players? Tapes are similar. A magnetic head reads and writes data on a magnetic ribbon spooled in an enclosure. With that in mind, there are a few operations that you will perform frequently:
- rewind: rewinds (of course!) the tape and points the tape head to the beginning of the magnetic ribbon.
  Example command: mt -f /dev/nst0 rewind
- forward: moves forward count files. Every time data is written, a marker is set at the end. Let's say you write some data to the beginning of the tape: tar cvf /dev/nst0 backupdir
  Rewind the tape as above. Now, to move forward one marker: mt -f /dev/nst0 fsf 1
- erase: erases the tape. That's a really slow process and could take hours (if not days) for larger tapes. However, you can also do a short erase: mt -f /dev/nst0 erase 0
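To see how these operations fit together, here is a minimal sketch that writes two archives back to back and then positions the tape at the second one. It assumes the non-rewinding device /dev/nst0 used above; the TAPE and DRY_RUN variables, the run helper, the position_demo function and the otherdir folder are made up for illustration. By default it only prints the commands, so you can read through the sequence without a drive attached.

```shell
#!/bin/sh
# Sketch: write two archives, then seek to the second one.
# TAPE, DRY_RUN, run, position_demo and otherdir are illustrative names.
TAPE="${TAPE:-/dev/nst0}"    # non-rewinding tape device
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the commands, 0 = run them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

position_demo() {
    run mt -f "$TAPE" rewind        # back to the beginning of the tape
    run tar cvf "$TAPE" backupdir   # first archive; a marker follows it
    run tar cvf "$TAPE" otherdir    # second archive, written after that marker
    run mt -f "$TAPE" rewind
    run mt -f "$TAPE" fsf 1         # skip one marker: now at the second archive
    run tar tvf "$TAPE"             # list the contents of the second archive
}

position_demo
```

Once the printed sequence looks right, running it with DRY_RUN=0 would execute the same commands against the real device.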
Tar and incremental backups
tar has a handy feature that lets you do incremental backups and the workings are really simple.
Let's look at an example:
tar -C /home --listed-incremental=diff.snar -clpMvf /dev/nst0 data
This is what we call a Level 0 backup. diff.snar is special: it is a snapshot file in which tar records the state of the files that were added to the archive. Next, let's say you add file.txt to the folder data and run the above command again. The only file that would be added to the archive is file.txt. Moreover, diff.snar would be updated in place to reflect the new state, so the Level 0 snapshot is gone. This would be a Level 1 archive.
Obviously, if you want to keep a record of all the backups, you wouldn't want to overwrite diff.snar, but rather keep one snapshot file per level, something like this:
- diff0.snar: level0 backup
- diff1.snar: level1 backup
and so on...
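Here is a small sketch of that idea which you can try without a tape drive. It assumes GNU tar (bsdtar does not support --listed-incremental) and writes the archives to regular files in a temporary directory; the work directory, a.txt and the level0.tar/level1.tar names are made up for the example. The Level 1 run works on a copy of the Level 0 snapshot, so diff0.snar stays untouched.

```shell
#!/bin/sh
# Sketch: level 0 and level 1 archives with one snapshot file per level.
# Archives are regular files here; use /dev/nst0 instead to write to tape.
set -e
work=$(mktemp -d)
mkdir -p "$work/home/data"
echo "first" > "$work/home/data/a.txt"

# Level 0: diff0.snar does not exist yet, so everything is archived.
tar -C "$work/home" --listed-incremental="$work/diff0.snar" \
    -cpf "$work/level0.tar" data

# Add a file, then run level 1 against a copy of the level 0 snapshot.
echo "second" > "$work/home/data/file.txt"
cp "$work/diff0.snar" "$work/diff1.snar"
tar -C "$work/home" --listed-incremental="$work/diff1.snar" \
    -cpf "$work/level1.tar" data

echo "level 0 contents:"
tar -tf "$work/level0.tar"   # includes data/a.txt
echo "level 1 contents:"
tar -tf "$work/level1.tar"   # includes data/file.txt, but not data/a.txt
```

Because diff0.snar is preserved, you could take another Level 1 backup later by copying it again, instead of chaining every run onto the previous one.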
Backup Strategy
With all this quick preliminary information, we can try an incremental backup strategy as follows:
- Maintain two sets of full backup tapes and two sets of incremental backup tapes.
- Create a full backup at the start of every cycle: monthly, bimonthly, quarterly, or whatever you prefer.
- Until the beginning of next cycle, perform incremental backups.
- At any point in time, you should always have a backup set that has a full backup of the last cycle as well as the incremental backup tape(s) of the last cycle.
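One possible way to turn these rules into a decision is sketched below. It assumes a monthly cycle, alternates between two tape sets that I call A and B here, and does the full backup on the first day of the month. The plan_backup function and the set names are hypothetical; this is just one reading of the strategy.

```shell
#!/bin/sh
# Sketch: pick the tape set and backup type for a monthly cycle.
# plan_backup MONTH DAY prints a line such as "set=A type=full".
plan_backup() {
    month=${1#0}    # strip a leading zero so "08" is not read as octal
    day=${2#0}
    # Alternate tape sets between cycles, so the set holding the
    # previous cycle is never overwritten during the current one.
    if [ $((month % 2)) -eq 1 ]; then set_name=A; else set_name=B; fi
    # The first day of the cycle gets the full backup.
    if [ "$day" -eq 1 ]; then kind=full; else kind=incremental; fi
    echo "set=$set_name type=$kind"
}

plan_backup 3 1     # set=A type=full
plan_backup 3 15    # set=A type=incremental
plan_backup 4 1     # set=B type=full
```

For today's date, you could call it as plan_backup "$(date +%m)" "$(date +%d)". While set B is being written in an even month, set A still holds the full and incremental backups of the previous cycle.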
The tape utility script illustrates the idea. To perform a full backup, you would run something like:
tapeutility.sh -d /dev/nst0 -F -p /etc/tapeutility/folders.txt
Here, “-F” does a full backup of the folders listed in “folders.txt”.
For the next run, to create an incremental backup, you would run:
tapeutility.sh -d /dev/nst0 -I -p /etc/tapeutility/folders.txt
Here, “-I” does an incremental backup.
Please take a look at the script to see how the metadata file is determined for an incremental backup, and for the other features available for basic tape maintenance.
Wrap-up
I presented a simple way to use tapes for backups. Using a combination of full and incremental backups, and maintaining two sets of tapes, you get a reliable backup of your data, which you could combine with a RAID-style setup for reliable long-term data storage.