TwinDB Really Loves Backups
A week or two ago, one of my former colleagues at Percona, Jervin Real, gave a talk titled "Evolving Backups Strategy, Deploying pyxbackup" at Percona Live 2015 in Amsterdam. I think Jervin raised some very good points about where MySQL backup solutions in general fall short. There are plenty of tools and scripts out there that claim to do MySQL backups correctly but actually don’t. What interests me more, though, is measuring TwinDB against the points Jervin highlighted, to see whether TwinDB falls short too.
We distribute the TwinDB agent as a package that can be installed with the standard OS package manager: YUM on CentOS, RHEL and Amazon Linux, or APT on Debian and Ubuntu. Because the package is installed through the standard packaging system, any dependencies are resolved in the standard way. The packages are publicly available in the TwinDB repositories:
- https://packagecloud.io/twindb/main for CentOS, RHEL, Debian and Ubuntu
- https://packagecloud.io/twindb/amzn_main for Amazon Linux
Note that we had to package the TwinDB agent separately for Amazon Linux because, since version 2015.03, it is a special case: it can be treated as neither RHEL 6 nor RHEL 7. One issue we ran into was with the perl-DBD-MySQL package, which is a dependency of the Percona XtraBackup package. We had to repackage perl-DBD-MySQL; the details on why and how are in one of our earlier blog posts, "Xtrabackup and MySQL 5.6 on Amazon instance".
We have also tried to make the instructions for installing the agent as clear as possible.
The TwinDB agent itself is written in Python. Python is a portable language and runs on many Unix variants, as mentioned in the General Python FAQ. To test portability further, we also built a Continuous Integration/Continuous Deployment pipeline that runs automated integration tests on all the platforms that we support. Below is the list of supported platforms:
- CentOS 6
- CentOS 7
- Ubuntu Precise
- Ubuntu Trusty
- Debian Wheezy
- Debian Jessie
- Amazon 2014.09
- Amazon 2015.03
Rest assured that we take portability very seriously, and nothing gets rolled out without being automatically tested.
I will be publishing a separate post detailing how we implemented the Continuous Integration/Continuous Deployment pipeline.
A Tool For Non-DBAs
The main reason we developed the TwinDB agent was to simplify the backup process. We didn’t want to build something that was difficult to use or that would require the user to be a DBA. That’s why, in most cases, the user doesn’t need to configure anything: things like the backup schedule and retention policy have sane defaults that work for most setups. A configuration can be tied to a group of MySQL servers or to a single MySQL server, and if you want to change the backup schedule, it is easy to do from the GUI.
You don’t need to log in to the MySQL server over SSH to configure the TwinDB agent in any way. The agent is also smart enough to detect the replication topology and choose an appropriate slave to take backups from. If the replication topology changes, the agent detects that and automatically chooses a different slave.
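To make the idea of picking a slave more concrete, here is a minimal sketch of how such a choice could be made. It is not the TwinDB agent's actual logic: the function name `pick_backup_source`, the shape of the topology dictionary and the rule "prefer the slave with the lowest replication lag" are all assumptions made for illustration.

```python
from typing import Dict, Optional


def pick_backup_source(replicates_from: Dict[str, Optional[str]],
                       lag_seconds: Dict[str, float]) -> str:
    """Choose a server to take backups from (hypothetical selection rule).

    replicates_from maps each server to its master, or None for a server
    that does not replicate from anything.
    lag_seconds maps a slave to its current replication lag.
    """
    # Prefer slaves so that the backup load stays off the master.
    candidates = [s for s, master in replicates_from.items() if master is not None]
    if not candidates:
        # Standalone server: there is nothing else to back up from.
        candidates = list(replicates_from)
    # Lowest lag wins; the server name breaks ties deterministically.
    return min(candidates, key=lambda s: (lag_seconds.get(s, float("inf")), s))


# Initial topology: db1 is the master, db2 and db3 are its slaves.
topology = {"db1": None, "db2": "db1", "db3": "db1"}
print(pick_backup_source(topology, {"db2": 5.0, "db3": 0.0}))

# After a failover db3 becomes the master, so a different slave is chosen.
topology = {"db3": None, "db1": "db3", "db2": "db3"}
print(pick_backup_source(topology, {"db1": 1.0, "db2": 0.0}))
```

Re-running the same function whenever the topology changes is what makes the choice follow a failover without anyone touching the configuration.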
No Babysitting Needed
As you may have noticed from the description above, the agent is autonomous in most cases. You only really need to interact with it, through the GUI, when you have to change the configuration or restore a backup. And TwinDB support is there whenever and wherever you need it.
Remote Streaming Of Backups
The TwinDB agent stores the backup on a dedicated storage space, and not on the same server that runs MySQL. You don’t even have to manage the storage space. TwinDB manages the storage space and provides you with two options:
- TwinDB-hosted scalable and secure storage in the cloud
- On-premises secure hosting on your hardware, be it baremetal or your private cloud
There are many reasons to back up a MySQL server, but the most important one is being able to restore the backup as and when needed. We at TwinDB understand that, and have simplified the restore process so that you can do it with just one click.
Off-site backups have traditionally been a starting point for data protection. However, there are risks at more than one stage: while the backups are being transported and while they are stored. From our perspective, therefore, encryption of data in flight and encryption of data at rest are equally important: together they enable both safe transport and safe storage of backed-up data. The next logical question is how exactly the encryption works. TwinDB uses asymmetric encryption with GPG (OpenPGP). The GPG key is auto-generated on the MySQL server that is being backed up by the agent, and users own the GPG private key. Furthermore, the encryption is always on. So whether you use the TwinDB agent as an off-site backup solution or back up to on-premises storage, the backups are always encrypted.
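One way to picture why encrypting on the database server covers both transport and storage is a streaming pipeline in which the backup is encrypted before it ever leaves the host. The sketch below is an illustration under stated assumptions, not the agent's real implementation: the helper names `build_backup_pipeline` and `run_pipeline`, the use of `ssh` as the transport, and the example recipient, host and path are all hypothetical. Only the `xtrabackup` and `gpg` options shown are standard ones.

```python
import shlex
import subprocess
from typing import List


def build_backup_pipeline(recipient: str, storage_host: str,
                          remote_path: str) -> List[List[str]]:
    """Return the commands of a backup | encrypt | ship pipeline (assumed layout)."""
    return [
        # Stream a backup to stdout instead of writing it to local disk.
        ["xtrabackup", "--backup", "--stream=xbstream"],
        # Encrypt the stream with the recipient's public key; only the
        # holder of the matching private key can decrypt it later.
        ["gpg", "--batch", "--encrypt", "--recipient", recipient],
        # Ship the already-encrypted stream to remote storage (assumed transport).
        ["ssh", storage_host, "cat > " + shlex.quote(remote_path)],
    ]


def run_pipeline(commands: List[List[str]]) -> List[int]:
    """Run the commands connected by pipes and return their exit codes."""
    procs = []
    prev_stdout = None
    for i, cmd in enumerate(commands):
        is_last = i == len(commands) - 1
        proc = subprocess.Popen(cmd, stdin=prev_stdout,
                                stdout=None if is_last else subprocess.PIPE)
        if prev_stdout is not None:
            # Close the parent's copy so upstream sees a broken pipe
            # if the downstream process exits early.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    return [p.wait() for p in procs]


# Hypothetical values; running the pipeline needs xtrabackup, gpg and ssh installed.
pipeline = build_backup_pipeline("backup@example.com", "storage.example.com",
                                 "/backups/db1.xbstream.gpg")
for command in pipeline:
    print(" ".join(command))
```

Because the data is encrypted before the transport step, the storage side only ever sees ciphertext, which is why the same setup works for hosted and on-premises storage alike.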
Full and Incremental Backups
The TwinDB agent supports both full and incremental backups. You don’t need any additional configuration to enable incremental backups. The GUI allows you to configure the backup schedule as shown in one of the screenshots above. You can configure how often full and incremental backups are taken.
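As a rough illustration of how a schedule like this can be turned into a decision at backup time, consider the sketch below. The function name `next_backup_type` and the weekly default for full backups are assumptions chosen for the example, not TwinDB's actual defaults.

```python
from datetime import datetime, timedelta
from typing import Optional


def next_backup_type(now: datetime,
                     last_full: Optional[datetime],
                     full_every: timedelta = timedelta(days=7)) -> str:
    """Decide whether the next scheduled run should be full or incremental."""
    # An incremental backup needs a full backup to build on, so the very
    # first run, or a run after the full-backup interval has elapsed, is full.
    if last_full is None or now - last_full >= full_every:
        return "full"
    return "incremental"


last_full = datetime(2015, 10, 5)
print(next_backup_type(datetime(2015, 10, 8), last_full))   # within the week
print(next_backup_type(datetime(2015, 10, 12), last_full))  # a week has passed
print(next_backup_type(datetime(2015, 10, 12), None))       # no full backup yet
```

Changing how often full backups are taken then comes down to changing a single interval, which is the kind of setting a GUI can expose without the user editing anything on the server.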
We are working on a feature that will allow in-flight backup validation. Our extensive experience with the InnoDB Data Recovery Tool, and with data recovery itself, gives us a head start on how to implement it. In-flight validation will provide immediate feedback, in contrast to the backup validation techniques generally used right now, which is all the more beneficial for users with large datasets.
At the moment, though, our backup validation strategy revolves around regular backup restores.
Having read through all of this, I hope you have a better picture of how the TwinDB agent stacks up against other backup solutions. Having compared it against the points my former colleague Jervin raised, I certainly see it more clearly now. I think TwinDB is a good example of a smoothly working backup solution that is easy to implement.