The Borg
A very useful utility is the Borg Backup system, or just Borg. It’s a deduplicating backup system, meaning that it scans the files and, when it finds data that is already in the backup, replaces that data in the second and all subsequent files with a reference to the first instance.
The idea is that the same data is only stored once. Every backup you take after the initial one stores only the differences and the new data accumulated since the last backup. This means that backups after the initial one are very fast and efficient, and they save bandwidth and storage space.
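To make the "store once, reference afterwards" idea concrete, here is a toy sketch. It deduplicates whole files by their SHA-256 hash, whereas Borg itself works on smaller chunks of data, so treat it purely as an illustration. The temporary directories, file names and the index file are all made up for the example.

```shell
# Toy whole-file deduplication. Borg works on chunks, not whole files,
# so this only illustrates the principle.
store="$(mktemp -d)"     # hypothetical blob store
src="$(mktemp -d)"       # hypothetical data to back up

printf 'same data\n'  > "$src/a.txt"
printf 'same data\n'  > "$src/b.txt"
printf 'other data\n' > "$src/c.txt"

for f in "$src"/*; do
    hash="$(sha256sum "$f" | cut -d' ' -f1)"
    if [ ! -e "$store/$hash" ]; then
        cp "$f" "$store/$hash"      # first time this content is seen
    fi
    # The "backup" only records a name -> hash reference.
    printf '%s %s\n' "$(basename "$f")" "$hash" >> "$store/index"
done

blob_count="$(find "$store" -type f ! -name index | wc -l | tr -d ' ')"
echo "files: 3, stored blobs: $blob_count"
```

Three files go in, but a.txt and b.txt share the same content, so only two blobs end up in the store.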
Traditional backups usually take a full backup every month and then increments daily or so. If you need to restore a file you have to take the latest full backup and then apply each increment that was taken after it. With Borg that is not necessary, as you can view the file system exactly as it looked at each and every backup point taken.
In fact you can mount the whole backup as a file system and then traverse it from there. It’s very effective. So let’s get started because face it, you don’t back up as much as you should do!
Borg can be used on multiple platforms, but my commands here will be for Linux.
The first step is to create a repository. It may sit on a different machine, a NAS, an attached USB drive or even on the same machine. Of course you really want multiple backups, so you can take the borg backup locally and then rsync it to as many locations as you feel is necessary.
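A minimal sketch of that rsync step, assuming the repository lives at /bup (the path used in the commands in this post) and that the copies go to locally mounted drives. The destination paths and the sync_repo helper are made up for illustration. It is probably wise to run this only when no backup is in progress, so that the copy is not taken from a half-written repository.

```shell
# Hypothetical helper: mirror a Borg repository to another location.
# --delete removes files from the copy that no longer exist in the source.
sync_repo() {
    src="$1"
    dest="$2"
    # Trailing slashes make rsync copy the directory contents,
    # not the directory itself.
    rsync -a --delete "${src%/}/" "${dest%/}/"
}

# Hypothetical destinations; replace with your own mount points.
for dest in /mnt/usb/bup /mnt/nas/bup; do
    if [ -d "$dest" ]; then
        sync_repo /bup "$dest"
    fi
done
```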
Take the backup
The first step is to create the directory for the backup repo, and then we need to initialize it for use with borg. This is quite simply done as:
$ sudo mkdir /bup
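$ sudo borg init --encryption=repokey /bup
Borg 1.1 and later require the --encryption option. With repokey the key is stored in the repository and protected by a passphrase that borg asks for.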
Once the repository has been created it is time for the first backup. The format should be clear in a bit; it’s not complicated and can look like this:
$ sudo borg create --progress --info --stats /bup::lenovo-170202_163423 /home /root /boot /etc /var
The command above should be a single line. The first thing we give to borg is the command, in this case create, which creates a new backup set for us. Then we have some flags: --progress shows a progress indicator while borg is working, which also details the number of bytes being read, backed up, compressed and deduplicated. Next, --info sets the information level borg presents to us, and --stats lets borg summarize the operation with some statistics.
The next part of the command, /bup::lenovo-170202_163423, specifies the backup location and backup name. The name is given after the double colon ::. In this case it’s composed of the date (yymmdd) and the time (hhmmss) at which the backup was started. Doing that makes it easy to find the right set of data later when a restore is needed.
Why did I prefix it with lenovo? Well, my main Linux laptop is a Lenovo and I also have other computers, like an ASUS laptop etc. The beauty of deduplicating backups is that I can back up multiple machines to the same repo. By doing that it will deduplicate across the machines, and if I have the same files and data in multiple places they will just be replaced by references to the data that is already in the backup.
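Typing the date and time by hand gets old quickly, so the name can be generated. This is a small sketch, assuming the prefix-yymmdd_hhmmss scheme described above; the make_archive_name helper is made up for illustration and falls back to the machine's node name when no prefix is given.

```shell
# Hypothetical helper: build an archive name like lenovo-170202_163423.
make_archive_name() {
    prefix="${1:-$(uname -n)}"
    printf '%s-%s\n' "$prefix" "$(date +%y%m%d_%H%M%S)"
}

archive_name="$(make_archive_name lenovo)"
echo "$archive_name"
```

The result can then be used after the double colon, for example /bup::$archive_name.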
The final part of the command is simply all the paths I want to include in this backup. They can vary from time to time. I might back up /home daily but /root only once a week if I want. No problem at all with borg.
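Varying the paths can be scripted too. The sketch below assumes a schedule of /home every day and /root only on Sundays; the paths_for_day helper and the schedule itself are just examples. It prints the borg command instead of running it, so nothing is backed up until the echo is removed.

```shell
# Hypothetical schedule: /home every day, /root only on Sundays.
# date +%u prints the ISO weekday, 1 (Monday) to 7 (Sunday).
paths_for_day() {
    if [ "$1" -eq 7 ]; then
        echo "/home /root"
    else
        echo "/home"
    fi
}

paths="$(paths_for_day "$(date +%u)")"
archive="/bup::lenovo-$(date +%y%m%d_%H%M%S)"

# Print the command instead of running it; drop the echo to take the backup.
# $paths is left unquoted on purpose so it splits into separate arguments.
echo sudo borg create --stats "$archive" $paths
```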
Restoring a backup
No backup system is really deployed until you have attempted and succeeded in retrieving data from it. Only then do you know what to do in an emergency, whether that means extracting old data that was mistakenly erased or restoring a full system after a hard drive crash.
Restoring a borg backup works a little differently from what you may be used to. First of all you can of course extract the data fully, or just single files if you know their paths, just like with any other backup system. The restore command is called extract in borg.
$ sudo borg extract /bup::lenovo-170202_163423
This will extract the entire archive, and then you can move the files into their respective locations. You can also extract, for example, only the etc folder from the archive:
$ sudo borg extract /bup::lenovo-170202_163423 etc
Extraction always writes to the current working directory. Therefore you should first extract and then move the files into their correct location in your file system. If all the backups are taken from the root of the file system / you can cd there before extracting, but I recommend extracting on a different volume first and then restoring from there. The reason is that there is usually a lot of stuff in a backup that you may not always want to restore.
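A sketch of extracting into a scratch directory first. The directory /tmp/borg-restore and the RESTORE_DIR variable are assumptions; any volume with enough free space would do. The --dry-run and --list options let you preview what would be written before anything touches the disk. Run it as root (or add sudo) if the repository belongs to root.

```shell
# Assumed scratch location; override with RESTORE_DIR if needed.
restore_dir="${RESTORE_DIR:-/tmp/borg-restore}"
archive="/bup::lenovo-170202_163423"

mkdir -p "$restore_dir"
cd "$restore_dir" || exit 1

if command -v borg >/dev/null 2>&1 && [ -d /bup ]; then
    # Preview what would be extracted, without writing anything.
    borg extract --dry-run --list "$archive" etc
    # Then extract for real into the scratch directory.
    borg extract "$archive" etc
else
    echo "borg or /bup not available; skipping extract"
fi
```

Afterwards the files sit under the scratch directory and can be inspected and copied into place one by one.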
Mounting the backup as a file system
Borg also offers another way. You can mount a single backup as a volume, or you can mount the whole repo, see all the backup points made, select the one you want and then just copy the files from there to the live system.
$ sudo borg mount /bup::lenovo-170202_163423 /mnt
This will mount the backup lenovo-170202_163423 in the file system at /mnt. You can then cd to /mnt and then use cp etc to copy the files to their right places.
When done you can unmount it (otherwise other processes can’t back up, as the repo is locked while mounted):
$ sudo umount /mnt
Borg uses FUSE to mount backups as local directories.
You may also mount the whole repository:
$ sudo borg mount /bup /mnt
Now when you go into the /mnt folder you will see all your backup names as directories:
$ ls
161204_040001 170101_203409 170113_040001 170117_040001 170121_040001 170125_010344 170128_030332
161206_040001 170108_040001 170114_040001 170118_040001 170122_214910 170125_040001 170128_040001
161218_174848 170111_040001 170115_040002 170119_040001 170123_040001 170126_040001 170129_040001
161225_040001 170112_040001 170116_040001 170120_040001 170124_040002 170127_040001 170201_082851
As you can see I generally name my backups with YYMMDD_HHMMSS just so it’s easy for me to find a specific date.
I can then cd to one of them:
$ cd 170112_040001
$ ls
boot etc home root var vmlinuz vmlinuz.old
When done, don’t forget to unmount the archive as no new backups can be taken while it is mounted.
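Since a forgotten mount blocks later backups, it can help to wrap the mount, copy and unmount in one step. The sketch below uses borg umount, which is borg's own counterpart to borg mount. The copy_from_archive helper, the example file etc/fstab and the destination /tmp/fstab.restored are made up for illustration. Run it as root (or add sudo) if the repository belongs to root.

```shell
# Hypothetical helper: copy one path out of an archive, then always unmount.
copy_from_archive() {
    archive="$1"
    mountpoint="$2"
    path="$3"
    dest="$4"
    borg mount "$archive" "$mountpoint" || return 1
    cp -a "$mountpoint/$path" "$dest"
    status=$?
    # Unmount even if the copy failed, so the repo does not stay locked.
    borg umount "$mountpoint"
    return "$status"
}

# Example call, skipped when borg or the repository is not present.
if command -v borg >/dev/null 2>&1 && [ -d /bup ]; then
    copy_from_archive /bup::lenovo-170202_163423 /mnt etc/fstab /tmp/fstab.restored
fi
```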
There you go. Start using it.