The Internet is full of simple shell backup scripts. Every DBA and system administrator (myself included) has written one. Bash is a bad tool for this job, and in this post I will explain why.
Let’s first figure out why the need to write anything arises at all. There are plenty of great backup tools, both for databases and for files. Following the Unix philosophy, they do one thing and do it well. With tar I can copy files; with mysqldump I can copy a MySQL database. But there are other problems to solve: scheduling (admittedly the easiest one), encryption (both in transit and at rest), compression, backup retention, talking to storage. Surely I’ve missed many more smaller but important details. That’s exactly why people write wrappers on top of their favorite backup tools.
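A typical home-grown wrapper glues those tools together into one pipeline. Here is a minimal sketch of such a nightly cron job; the database name, paths, and key file are placeholders, not recommendations:

```shell
#!/usr/bin/env bash
# Hypothetical nightly backup: dump, compress, encrypt at rest, expire old copies.
mysqldump --single-transaction mydb \
  | gzip \
  | openssl enc -aes-256-cbc -pbkdf2 -pass file:/etc/backup.key \
  > "/backups/mydb-$(date +%F).sql.gz.enc"

# Retention: delete copies older than 7 days.
find /backups -name 'mydb-*.sql.gz.enc' -mtime +7 -delete
```

Every extra requirement — S3 upload, incremental dumps, alerting — becomes another bolted-on stage in a script like this.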
Don’t you think it’s too much for a shell script?
At the beginning I, too, needed something quick and dirty to back up our databases and files. In one evening I wrote a simple script (!) for MySQL and for files. It was a Bash script, and its config was another piece of Bash. I put it in cron and was content for the moment.
Then I tried to reuse the script for our customers. In that state it wasn’t good enough for them: their databases were bigger, and they needed S3 storage, incremental backups, and other features. That was the moment the simplicity backfired.
First of all, the shell script had zero test coverage — neither unit nor integration tests. If you make even the smallest change, there is no way to guarantee that something else won’t break. Same with bug fixes: fix a bug once, and there is no way to make sure it won’t reappear. I’m not aware of any unit test frameworks for Bash, and I’ve never seen anybody use one. That’s why Bash is a good tool for stuff like “do step 1; do step 2; do step 3; exit”. Anything more complex than that is reckless driving without a seatbelt — if you’re skilled enough you’ll get to point B, but if anything goes wrong, the consequences are catastrophic.
Another problem with Bash is its lack of means to work with data. Everything that can be elegantly organized into Python classes becomes spaghetti code in Bash. For any complex algorithm this is vitally important.
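To illustrate (a toy sketch with made-up target names): Bash has no structs or classes, so even a flat list of backup targets with two fields per record has to be faked with delimited keys in a single associative array (Bash 4+).

```shell
#!/usr/bin/env bash
# No records in Bash: fields are flattened into "name,field" keys of one array.
declare -A targets=(
  [db1,host]="db1.example.com" [db1,keep_days]="7"
  [db2,host]="db2.example.com" [db2,keep_days]="30"
)

for name in db1 db2; do
  echo "backup ${name}: host=${targets[$name,host]} keep=${targets[$name,keep_days]}"
done
```

Add a third field, or a list of databases per host, and this quickly degenerates into the spaghetti described above.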
Error handling in Bash is limited and tricky. The best it has is `set -o errexit`. And I hope you learned about `set -o pipefail` not the hard way, like I did :).
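The classic trap: without `pipefail`, a pipeline’s exit status is that of its last command, so a failed dump feeding a perfectly happy gzip looks like success. A minimal demonstration, with `false` standing in for a failing mysqldump and `true` for a gzip that still succeeds:

```shell
#!/usr/bin/env bash
false | true
echo "without pipefail: $?"   # prints 0 — the failure is invisible

set -o pipefail
false | true
echo "with pipefail: $?"      # prints 1 — the failure propagates
```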
Instead of a conclusion, I want to say that the vast majority of our data recovery cases share a similar story: there was a home-baked script that wasn’t running, was silently failing, or was buggy. When the backup copy was finally needed, the script failed to deliver.
That happens because backup is not a simple problem and it cannot be solved by a simple script.
I’m writing this post looking at a tower (Torre Grossa) built in the 14th century. It wouldn’t have survived this long if the old masters hadn’t followed best practices.