Performance Corner: Faster Builds, Smaller Packages
Performance Corner is a new series where we highlight to you some changes in Serpent OS that may not be obvious, but
show a real improvement. Performance is a broad term that also covers efficiency, so things like making files smaller,
making things faster or reducing power consumption. In general things that are unquestionably improvements with little
or no downside. While the technical details may be of interest to some, the main purpose is to highlight the real
benefit to users and/or developers that will make using Serpent OS a more enjoyable experience. Show me the numbers!
Here we focus on a few performance changes
Ikey has been working on to the build process that are showing some pretty
awesome results! If you end up doing any source builds, you'll be thankful for these improvements. Special thanks to
ermo for the research into hash algorithms and enabling our ELF processing.
Start From the Beginning
When measuring changes, it's always important to know where you're starting from. Here are some results from a recent
glibc build, but before these latest changes were incorporated.
Payload: Layout [Records: 5441 Compression: Zstd, Savings: 83.13%, Size: 673.46 KB] Payload: Index [Records: 2550 Compression: Zstd, Savings: 55.08%, Size: 247.35 KB] Payload: Content [Records: 2550 Compression: Zstd, Savings: 81.46%, Size: 236.72 MB] ==> 'BuildState.Build' finished [4 minutes, 6 secs, 136 ms, 464 μs, and 7 hnsecs] ==> 'BuildState.Analyse' finished [21 secs, 235 ms, 300 μs, and 2 hnsecs] ==> 'BuildState.ProducePackages' finished [25 secs, 624 ms, 996 μs, and 8 hnsecs]
The build time is a little high, but a lot of that is due to a slow compiler on the host machine. But analysing and producing packages was also taking a lot longer than it needed to.
The Death of moss-jobs in boulder
In testing an equivalent build outside of
boulder, the build stages were about 5% faster. Testing under
jobs system was a bit excessive for the needs of
boulder, polling for work when we already know the times when
parallel jobs would be useful. Removing
moss-jobs allowed for simpler codepaths using multiprocessing techniques from
the core language. This work is integrated in
moss-deps and the excess overhead of the build has now been eliminated.
Before: ==> 'BuildState.Build' finished [4 minutes, 6 secs, 136 ms, 464 μs, and 7 hnsecs] ==> 'BuildState.Analyse' finished [21 secs, 235 ms, 300 μs, and 2 hnsecs] After: [Build] Finished: 3 minutes, 53 secs, 386 ms, 306 μs, and 4 hnsecs [Analyse] Finished: 8 secs, 136 ms, 22 μs, and 8 hnsecs
The new results reflect a 26s reduction in the overall build time. But only 13s of this relates to the
removal. The other major change is making the analyse stage parallel in
moss-deps (a key part of why we wanted
parallelism to begin with). Decreasing the time from 21.2s to 8.1s is a great achievement despite it doing more work as
we've also added ELF scanning for dependency information in-between these results.
New Hashing Algorithm
One of the unique features in
moss is using hashes for file names which allows full deduplication within packages,
the running system, previous system roots and for source builds with
boulder. Initially this was hooked up using
sha256, but it was proving to be a bit of a slowdown.
xxhash, the hash algorithm by Yann Collet for use in fast decompression software such as
now in many places!). This is seriously fast, with the potential to produce hashes faster than RAM can feed the CPU. The
hash is merely used as a unique identifier in the context of deduplication, not a cryptographic verification of origin.
XXH3_128bit has been chosen due to it having an almost zero probability of a collision across 10s of millions of
The benefit is actually two-fold. First of all, the hash length is halved from
sha256, so there's savings in the
package metadata. This shouldn't be understated as hash data is generally not as compressible as typical text and there
are packages with a lot of files! Here the metadata for the Layout and Index payloads has reduced by 232KB! That's about
a 25% reduction with no other changes.
Before: Payload: Layout [Records: 5441 Compression: Zstd, Savings: 83.13%, Size: 673.46 KB] Payload: Index [Records: 2550 Compression: Zstd, Savings: 55.08%, Size: 247.35 KB] After: Payload: Layout [Records: 5441 Compression: Zstd, Savings: 86.66%, Size: 522.97 KB] Payload: Index [Records: 2550 Compression: Zstd, Savings: 60.02%, Size: 165.75 KB]
Compressed this turns out to be about a 89KB reduction in the package size. For larger packages, this probably doesn't mean much but could help a lot more with delta packages. For deltas, we will be including the full metadata of the Layout and Index payloads, so the difference will be more significant there.
The other benefit of course is the speed and the numbers speak for themselves! A further 6.4s reduction in build time removing most of the delay at the end of the build for the final package. This will also improve speeds for caching or validating a package.
Before: ==> 'BuildState.Analyse' finished [21 secs, 235 ms, 300 μs, and 2 hnsecs] After: [Analyse] Finished: 1 sec, 688 ms, 681 μs, and 8 hnsecs
With these changes combined, building packages can take 12x less time in the analyse stage, while reducing the size of the metadata and the overall package. We do expect the analyse time to increase in future as we add more dependency types, debug handling and stripping, but with the integrated parallel model, we can minimize the increase in time.
We're Not Done Yet
The first installment of
Performance Corner shows some great wins to the Serpent OS tools and architecture. This is
just the beginning and there will likely be a follow up soon (you may have also noticed that it takes too long to make
the packages), and there's a couple more tweaks to further decrease the size of the metadata. Kudos to
getting these implemented!