TOB (Tape Oriented Backup)
This document details how to use the Unix tool tob (tape-oriented backup) to do backups to the cloud (Google Drive specifically). In effect, we're going to use Google Drive the way we used to use magnetic tapes for backups. Tob has been around for a long time and is a very reliable tool. It is also written entirely in bash, so it should work on most Unix/OSX systems.
Why use it?
If you're working in Linux, or want an off-site backup or an alternative to Time Machine on a Mac, tob is an easy-to-use tool that makes archives (single tar files). Archives are the most efficient way to send data to Google Drive because they avoid the throttling that occurs when sending many files. Tob also handles the bookkeeping needed to do full, incremental and differential backups.
Installing tob
Begin by installing the required supplemental software on the system you want to back up.
Required software for OSX/Darwin
- The GNU version of find. The version of find that comes with Darwin lacks many key features used by tob. The easiest way to get it is to install homebrew and then install its findutils package.
- You must install the Google Drive Stream on your Mac. You will need to be signed in to your Carnegie Google account.
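A quick way to confirm that a GNU find is available is to look for "GNU" in its version banner. The following is a minimal sketch, not part of tob: the helper name find_gnu_find is hypothetical, and it assumes that Homebrew's findutils package installs GNU find under the name gfind, which may differ on your setup.

```shell
# Print the path of the first command that identifies itself as GNU find.
# The candidate names are assumptions: "gfind" is the name Homebrew's
# findutils package normally uses; "find" is the system default.
find_gnu_find() {
    for candidate in gfind find; do
        # BSD find rejects --version, so the check fails for it.
        if command -v "$candidate" >/dev/null 2>&1 &&
           "$candidate" --version 2>/dev/null | grep -q 'GNU'; then
            command -v "$candidate"
            return 0
        fi
    done
    return 1
}

if gnu_find=$(find_gnu_find); then
    echo "GNU find found at: $gnu_find"
else
    echo "No GNU find found; on a Mac, try: brew install findutils"
fi
```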
Required software for Linux
- The rclone tool (this may already be installed). You will have to configure it to work with your cloud server. An instructional video can be found here.
Download/install tob
Download tob from github:
$ cd src
$ git clone https://github.com/obscode/tob-ppa.git tob
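Make a link to the tob script somewhere in your PATH and optionally link the man page to somewhere in your MANPATH. Use absolute paths for the link targets, because a relative target is resolved relative to the folder that holds the link. For example, if src is in your home folder:
$ ln -s $HOME/src/tob/tob $HOME/bin
$ sudo ln -s $HOME/src/tob/tob.8 /usr/local/man/man8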
Using a link is convenient if you ever update tob (by running git pull in the src/tob folder), since the link always points at the current version. Next, make a tob folder somewhere convenient (say your home folder) and make the folders tob needs:
$ cd $HOME
$ mkdir tob
$ mkdir tob/volumes
$ mkdir tob/lists
Copy one of the sample configuration files to this tob folder. For example, if you are on a Linux machine:
$ cp src/tob/sample-rc/tob.rc.rclone tob/tob.rc
If you are on a Mac:
$ cp src/tob/sample-rc/tob.rc.osx+gdrive tob/tob.rc
Edit the tob.rc file and change TOBHOME to point to the tob folder you created and TOBLISTS to point to the lists folder. Also edit any other variables needed (see comments in the sample file).
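The folder layout and the two settings that must match it can be kept in step with a small helper. This is only a sketch: the function name make_tob_layout is hypothetical, and it assumes that tob.rc uses shell-style NAME="value" assignments, so check the sample file before copying the printed lines into it.

```shell
# Create the folders tob needs under a chosen base folder and print the
# two settings that have to agree with that location in tob.rc.
# make_tob_layout is a hypothetical helper; TOBHOME and TOBLISTS are the
# variable names mentioned for the sample rc file.
make_tob_layout() {
    base=$1
    # mkdir -p leaves existing folders alone, so this is safe to re-run.
    mkdir -p "$base/volumes" "$base/lists" || return 1
    printf 'TOBHOME="%s"\n' "$base"
    printf 'TOBLISTS="%s"\n' "$base/lists"
}

# Example call for a tob folder in your home folder:
# make_tob_layout "$HOME/tob"
```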
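For convenience, make an alias in your startup scripts (.bashrc, .tcshrc, etc) to make tob use your custom rc file. In bash syntax, with the rc file in the tob folder created above:
alias tob='tob -rc $HOME/tob/tob.rc'
Check to make sure all your paths and settings are correct (the paths in the output will reflect your own setup):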
$ tob -check
Ok, got resource /home/cburns/etc/tob/tob.rc.
Checking existence of directories.
Checking volumes in /home/cburns/etc/tob/volumes
Cleaning up.
Setting up volumes to be backed up
That's it for installing tob. Now we define what needs to be backed up. You can set up multiple volumes to be backed up (perhaps on different schedules). For example, to specify a volume called mydata, you create two files in $TOBHOME/volumes. The first file is called mydata.startdir and has one or multiple lines indicating the root(s) of the backup. For instance:
/Volumes/data1
/Volumes/data2
Tob will back up all files and folders under these two folders (you could specify more). The second file is named mydata.exclude. As you might imagine, this is where you specify files to skip when making the backup. You use grep patterns, one per line, to make sure certain things are not backed up. For example, you might have:
^/Volumes/data1/scratch
\.core$
backup
Files and folders whose full path name matches a pattern will not be backed up. In the first line, the caret (^) matches the beginning of the path, so everything below /Volumes/data1/scratch will be ignored. The second line matches any file whose name ends in .core, and the third line matches any file or folder whose path contains the string backup.
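Before relying on an exclude file, it can help to preview what the patterns would drop. The sketch below assumes the patterns behave like basic grep regular expressions applied to the full path; how tob itself invokes grep may differ. The function name preview_excludes and the sample paths are made up for illustration.

```shell
# Read candidate paths on stdin and print the ones that survive the
# patterns in an exclude file (one grep pattern per line).
# preview_excludes is a hypothetical helper, not part of tob.
preview_excludes() {
    grep -v -f "$1"
}

workdir=$(mktemp -d)
cat > "$workdir/mydata.exclude" <<'EOF'
^/Volumes/data1/scratch
\.core$
backup
EOF

# Made-up paths: only run1.fits and notes.txt match no pattern.
printf '%s\n' \
    /Volumes/data1/scratch/tmp.dat \
    /Volumes/data1/results/run1.fits \
    /Volumes/data2/sim/a.core \
    /Volumes/data2/old_backups/x.txt \
    /Volumes/data2/notes.txt |
    preview_excludes "$workdir/mydata.exclude"
rm -rf "$workdir"
```

To preview a real volume, pipe the output of find on one of the start folders into preview_excludes instead of the made-up list.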
Using tob
That's it for setting things up. Now it's time to actually do the backups. There are three commands that you use:
- tob -full {volume}: Make a full backup of the volume named {volume}. This backs up everything (except paths matching patterns in the .exclude file). Every volume must have a full backup before differential or incremental backups can be made.
- tob -diff {volume}: Make a differential backup of volume {volume}. This will back up every file that has changed or been added since the last full backup.
- tob -inc {volume}: Make an incremental backup of volume {volume}. This will back up every file that has changed or been added since the most recent full, differential, or incremental backup.
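The difference between differential and incremental backups comes down to which reference time a file's modification time is compared against. The sketch below illustrates this with marker files and find -newer in a throwaway folder. It is an illustration of the idea only, not necessarily how tob implements it, and the marker names last_full and last_backup are hypothetical.

```shell
# Illustration only: pick files by comparing them with marker files.
demo=$(mktemp -d)
mkdir "$demo/data"

# touch -t takes [[CC]YY]MMDDhhmm; fixed times keep the demo reproducible.
touch -t 202312250000 "$demo/data/old.txt"      # not changed since the full backup
touch -t 202401010000 "$demo/last_full"         # time of the last full backup
touch -t 202401050000 "$demo/data/midweek.txt"  # changed after the full backup
touch -t 202401070000 "$demo/last_backup"       # time of the most recent backup of any type
touch -t 202401080000 "$demo/data/today.txt"    # changed after the most recent backup

changed_since() {
    # Print files under folder $1 modified more recently than marker file $2.
    (cd "$1" && find . -type f -newer "$2" | sort)
}

echo "differential would pick up:"   # midweek.txt and today.txt
changed_since "$demo/data" "$demo/last_full"
echo "incremental would pick up:"    # today.txt only
changed_since "$demo/data" "$demo/last_backup"
rm -rf "$demo"
```

A differential backup therefore grows until the next full backup, while each incremental backup stays small but restoring may require several of them.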
You'll typically want to use cron to have full backups created on some long timescale, and then differential and/or incremental backups on a more regular timescale. These timescales depend entirely on your usage. Here is an example:
$ crontab -l
MAILTO=cburns@carnegiescience.edu
TOB=/home/cburns/bin/tob -rc /home/cburns/etc/tob.rc
# Full backup the first day of every month at midnight
0 0 1 * * $TOB -full mydata
# differential backup every week on Sunday at 11:00pm
0 23 * * 0 $TOB -diff mydata
# incremental backup every day at 10:00 pm
0 22 * * * $TOB -inc mydata
This will make a full backup every month, a differential backup every week and an incremental backup every day. For more information on setting up cron, read this page.
Here are some other useful commands:
- tob -volumes: list all defined volumes
- tob -backups: list all backups that have been done, their type, and date.
- tob -find {pattern} : find a file that matches {pattern} in the backups. Use this to determine where (which volume and which backup) a backed-up file exists.
- tob -restore {volume} [file pattern] : restore a file matching [file pattern] from a {volume}. If [file pattern] is omitted, restore the entire volume. Restores full path names relative to the current working directory.
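Because restored files are written with their full path names under the current working directory, it is convenient to restore into a scratch folder and inspect the result before copying anything back. The sketch below uses only the -restore option described above. The function name restore_to_scratch is hypothetical, and the TOB variable mirrors the one in the crontab example so the rc file option can be passed along, since shell aliases are not available inside scripts.

```shell
# Restore into a scratch folder so that nothing in place is overwritten.
# Usage: restore_to_scratch SCRATCH_FOLDER VOLUME [FILE_PATTERN]
# Set TOB to e.g. "tob -rc /path/to/tob.rc" if tob needs the -rc option.
restore_to_scratch() {
    scratch=$1
    volume=$2
    pattern=$3
    mkdir -p "$scratch" || return 1
    # Run from inside the scratch folder, in a subshell so the caller's
    # working directory is unchanged. $TOB is deliberately unquoted so
    # that "tob -rc file" splits into separate words.
    (cd "$scratch" && ${TOB:-tob} -restore "$volume" ${pattern:+"$pattern"})
}

# Example (commented out so that sourcing this file has no side effects):
# restore_to_scratch "$HOME/restore-test" mydata 'run1.fits'
```

Running tob -find with the same pattern first shows which volume and backup hold the file.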
Caveats
- If you are running tob on Linux and using rclone to send to the cloud (see tob/sample-rc/tob.rc.rclone), you need to have enough temporary free space (scratch space) to create the tar archive before it is uploaded. If space is limited, you can try splitting your backup into multiple volumes.
- If you are running tob on a Mac and don't want to use rclone as above, you should install the Google Drive file stream client. This makes your Google Drive look like a remotely mounted file system. Tob can back up to this directly without having to create a tar archive locally first.
- If you are using rclone to do the backup, then tob cannot delete remote files automatically when they are no longer needed. So MAXBACKUPAGE will have no effect and you should just set it to -1.
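For the scratch-space caveat, a rough pre-flight check is to compare the size of the start folders with the free space on the filesystem that holds the scratch folder. This is a sketch with a hypothetical function name, enough_scratch. The du total is only an estimate of the archive size, because it ignores exclude patterns and any compression, and where tob writes its temporary archive depends on your rc file.

```shell
# Succeed if the scratch folder's filesystem has more free space than
# the combined size of the folders to be archived.
# Usage: enough_scratch SCRATCH_FOLDER START_FOLDER...
enough_scratch() {
    scratch_dir=$1
    shift
    # Combined size of all start folders, in 1K blocks.
    need_kb=$(du -sk "$@" | awk '{ total += $1 } END { print total + 0 }')
    # Fourth column of POSIX-format df output is the available space.
    free_kb=$(df -Pk "$scratch_dir" | awk 'NR == 2 { print $4 }')
    echo "need ${need_kb}K, have ${free_kb}K free in $scratch_dir"
    [ "$free_kb" -gt "$need_kb" ]
}

# Example (commented out; substitute your own scratch and start folders):
# enough_scratch /tmp /Volumes/data1 /Volumes/data2 || echo "consider splitting the volume"
```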