How to speed up builds with Omnibus caching

To build a package, Omnibus compiles its dependencies. If it’s a Python project, Omnibus compiles Python each time it packages the project. If the project depends on a library, the library has to be recompiled every time. That’s suboptimal, so let’s see what we can do to reduce the build time.

I was inspecting omnibus/omnibus.rb when I noticed two relevant options: use_git_caching and use_s3_caching. It turns out that S3 and Git caching are not the same thing, and Omnibus uses them differently. I found Git caching more useful, but let’s talk about both.

S3 caching

There is a use_s3_caching boolean parameter in omnibus.rb. The scarce Omnibus documentation describes it as “Indicate if you wish to cache software artifacts in S3 for quicker build times”. By software artifact they mean an archive with the software source code. For example, it’s Python-2.7.14.tgz for the python software. So the purpose of this cache is to download Python-2.7.14.tgz not from the internet, as specified in the software definition, but from an S3 bucket that is supposed to be a) under your control and b) faster than an internet website.

The speed benefit feels questionable, since S3 is not that much faster. The caching is still worth having, though. The thing is that URLs change all the time: owners reorganize their websites, and Omnibus committers don’t always keep the omnibus-software repository up to date with the changes. Also, S3 is more reliable than the average website.

So I would say S3 caching makes Omnibus builds more reliable. That’s already good enough, so let’s see how to configure and use it.

The configuration boils down to specifying the S3 bucket and AWS credentials. See the example in TwinDB Backup’s omnibus/omnibus.rb.

That’s not enough to use it. If the bucket is empty, Omnibus will get a 404 error and the build will fail. The cache must contain every archive Omnibus might need for the build. A pretty straightforward way to ensure that is to populate the cache before the build. Again, see an example.

It won’t hurt to run it each time; it is slow only the first time. If the bucket already has the necessary entry, bin/omnibus cache populate won’t spend time uploading it again.
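As a rough sketch of that order of operations, the populate step can simply run right before the build. The function name, the OMNIBUS_BIN variable, and the twindb-backup project name are assumptions made for illustration; only the bin/omnibus cache populate command comes from the text above.

```shell
#!/usr/bin/env bash
# Sketch: warm the S3 source cache, then build.
# OMNIBUS_BIN and populate_and_build are hypothetical names for this example.
OMNIBUS_BIN="${OMNIBUS_BIN:-bin/omnibus}"

populate_and_build() {
    local project="$1"
    # Upload source archives that are missing from the bucket.
    # Entries that are already there are not uploaded again.
    $OMNIBUS_BIN cache populate "$project" || return 1
    # Build the package once the bucket holds every archive it needs.
    $OMNIBUS_BIN build "$project"
}
```

A build script would then call something like populate_and_build twindb-backup.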

That’s all for the S3 cache. Let’s see how the Git cache works and what it can do for us.

Git caching

The git cache makes builds much faster because it caches not the source code, like the S3 cache, but the compiled software. For example, it caches the compiled Python, so Omnibus doesn’t have to recompile it. Sounds cool, doesn’t it? Let’s configure it.

omnibus.rb offers only one parameter for it, the use_git_caching boolean.

If the parameter is true, Omnibus will save the compiled software in /var/cache/omnibus/cache/git_cache/opt/<project name>.

To make git caching work, it’s your responsibility to save the cache somewhere and make it available before the build starts.

Sure, if you build the packages on the same machine every time, you don’t have to care about preserving the cache. But in the world of CI/CD and containers, Omnibus builds on a new machine (or container, for that matter) every time.

I build packages on Travis CI, so I cannot do much about preserving the cache between builds. Instead, before a build I pull the cache from an S3 bucket, and after the build I save it back to the bucket. Even with the cache copied back and forth, the build time is halved, because besides Python and libraries I have to compile XtraBackup. So here are the steps from omnibus/omnibus_build.sh:

Note that the cache is specific to the OS version.
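One way to keep the caches for different OS versions apart is to encode the platform and version in the name of the cache archive. This is only a sketch: the naming scheme, the function name, and the example values are assumptions, not taken from the original script.

```shell
#!/usr/bin/env bash
# Sketch: derive an OS-specific file name for the cache archive.
# cache_archive_name and the naming scheme are hypothetical.
cache_archive_name() {
    local project="$1" platform="$2" os_version="$3"
    # Example result: twindb-backup-ubuntu-xenial-git_cache.tar.gz
    printf '%s-%s-%s-git_cache.tar.gz\n' "$project" "$platform" "$os_version"
}
```

The resulting name can then be used as the object key in the bucket, so a build on one OS version never picks up binaries compiled on another.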

Git configuration. At first, I thought Omnibus was going to save the cache in the same repo as the source code. Fortunately, that’s not the case: Omnibus commits the cache to a git bundle.
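A minimal sketch of this step, assuming that the commits Omnibus makes for the cache need a git identity, which a fresh CI machine usually lacks. The function name, the two environment variables, and the placeholder name and email are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: give git an identity on a fresh build machine so that commits can succeed.
# configure_git_identity, GIT_CACHE_USER_NAME and GIT_CACHE_USER_EMAIL are hypothetical.
configure_git_identity() {
    # Fall back to placeholder values when the variables are not set.
    git config --global user.name  "${GIT_CACHE_USER_NAME:-Omnibus Builder}"
    git config --global user.email "${GIT_CACHE_USER_EMAIL:-omnibus@example.com}"
}
```

This only touches the global git configuration of the build machine; the project’s own repository is left alone.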

Put the cache in the directory where Omnibus expects it to be.
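The restore could look roughly like this, assuming the cache is stored in the bucket as a gzipped tarball of the git_cache directory and fetched with the AWS CLI. The function name, the GIT_CACHE_DIR variable, and the archive layout are assumptions; the directory itself is the one mentioned earlier.

```shell
#!/usr/bin/env bash
# Sketch: download the cache archive and unpack it where Omnibus looks for it.
# restore_git_cache and GIT_CACHE_DIR are hypothetical names.
GIT_CACHE_DIR="${GIT_CACHE_DIR:-/var/cache/omnibus/cache/git_cache}"

restore_git_cache() {
    local s3_url="$1" archive
    archive="$(mktemp)"
    mkdir -p "$GIT_CACHE_DIR"
    # A missing archive is not an error: the first build starts with an empty cache.
    if aws s3 cp "$s3_url" "$archive"; then
        tar -xzf "$archive" -C "$GIT_CACHE_DIR"
    fi
    rm -f "$archive"
}
```

It would be called before the build with the URL of the archive, for example restore_git_cache s3://my-omnibus-cache/twindb-backup-ubuntu-xenial-git_cache.tar.gz, where the bucket name is made up.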

And after the build, save it back to S3.
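The upload is the mirror image of the restore. Again, this is a sketch under assumptions: the cache is packed as a gzipped tarball of the git_cache directory and uploaded with the AWS CLI, and save_git_cache and GIT_CACHE_DIR are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: pack the git cache and upload it for the next build.
# save_git_cache and GIT_CACHE_DIR are hypothetical names.
GIT_CACHE_DIR="${GIT_CACHE_DIR:-/var/cache/omnibus/cache/git_cache}"

save_git_cache() {
    local s3_url="$1" archive rc
    archive="$(mktemp)"
    # Pack the contents of the cache directory, with paths relative to it.
    tar -czf "$archive" -C "$GIT_CACHE_DIR" . || { rm -f "$archive"; return 1; }
    rc=0
    aws s3 cp "$archive" "$s3_url" || rc=$?
    rm -f "$archive"
    return "$rc"
}
```

Running it only after a successful build avoids replacing a good cache with a half-built one.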

Comments, need help?

If you fancy Omnibus and fast builds, or have questions, feel free to ask on the TwinDB forum or drop an email to support@twindb.com for commercial support.
