Brotlifying Nginx under NixOS

HTTP compression has been part of the standard since the late 1990s. The idea behind it is simple: to save bandwidth, compress the responses that a server sends. The two most common compression formats understood by browsers today are gzip and DEFLATE. Both employ the same compression algorithm, but differ in how they do checksumming and headers, and both date back to that same late-1990s standard.

Brotli is a lossless compression algorithm developed at Google, starting in 2013. Initially intended for use in the Web Open Font Format, its specification was published as RFC 7932 in 2016, and at about the same time it started being available in mainline Web browsers as a generic HTTP compression format.

As Brotli tends to achieve better compression ratios than gzip, and is widely supported by browsers in use today, it may be a good idea for today's Web servers to support it, in addition to the classic gzip and DEFLATE encodings. NixOS's Nginx can do Brotli, but it requires enabling an extra module, as Brotli support is not built into the server normally.

The Nginx Brotli module

Nginx from Nixpkgs can be rebuilt with the Brotli module by using override:

nginxMainline.override {
  modules = nginxMainline.modules ++ [ nginxModules.brotli ]; 
}
Overriding nginxMainline to include the Brotli module

nginxMainline is Nginx from the mainline branch. Mainline is the Nginx branch which sees more active development, while the stable branch is closer to what in other places would be called a Long Term Support version—it sees mostly bug fixes, and no new feature development. In both cases, there are tagged releases, and both branches are considered suitable for use in production. nginx in Nixpkgs points to nginxStable, but it is fairly easy to switch between the two.

When overriding modules, we have to remember to include the original modules—nginxMainline.modules—and append brotli to them. Unlike with NixOS configs, there is no merging of lists, so if we were to set modules to just the brotli module, we would get rid of the defaults.
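
In a NixOS configuration, where the package set is available as pkgs, the same override could be handed to services.nginx.package. This is only a minimal sketch of how the pieces fit together, not the only way to wire it up:

```nix
{ config, pkgs, lib, ... }:
{
  services.nginx.package = pkgs.nginxMainline.override {
    # Keep the default modules and append Brotli to them; setting
    # modules to just the Brotli module would drop the defaults.
    modules = pkgs.nginxMainline.modules ++ [ pkgs.nginxModules.brotli ];
  };
}
```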

Chances are, this modified Nginx will not be in cache.nixos.org, and will need to be built locally. An exception is nginxStable with Brotli; this tends to be in cache, because a test incidentally builds it. The Discourse module in Nixpkgs enables Brotli with Nginx, and so when Discourse is tested, Nginx with Brotli has to be built, and subsequently ends up in cache. Fortunately, compiling Nginx is not a very resource intensive operation either way.

In NixOS configurations, an alternative to manually overriding modules is available: services.nginx.additionalModules. This setting internally does what we previously did: it adds our selected modules to the default list of modules.

Adding the Brotli module will not, however, add the relevant Brotli settings. The module's README contains an example configuration, so we can crib from that.

{ config, pkgs, lib, ... }:
{
  services.nginx = {
    enable = true;
    package = pkgs.nginxMainline;
    additionalModules = [ pkgs.nginxModules.brotli ];
    recommendedGzipSettings = true;
    recommendedProxySettings = true;
    appendHttpConfig = ''
      brotli on;
      brotli_comp_level 6;
      brotli_static on;
      brotli_types application/atom+xml application/javascript application/json application/rss+xml
             application/vnd.ms-fontobject application/x-font-opentype application/x-font-truetype
             application/x-font-ttf application/x-javascript application/xhtml+xml application/xml
             font/eot font/opentype font/otf font/truetype image/svg+xml image/vnd.microsoft.icon
             image/x-icon image/x-win-bitmap text/css text/javascript text/plain text/xml;
    '';
  };
}
Fragment of a NixOS configuration setting up Nginx with Brotli

This sets up both serving of statically compressed assets with brotli_static on (more on this later), and dynamic compression of responses with brotli on. The compression level can be tweaked to trade speed and CPU usage for density. We generally do not want dynamic compression to be too intense, because the server spending more time to achieve higher compression means more time spent waiting by the clients. There is also a reasonable list of MIME types to compress. We do not want to compress things like already-compressed images, but on the other hand, mainstays of the Web like HTML and CSS do compress pretty well with Brotli.

This configuration enables Brotli on the whole server, but we could also enable Brotli on vhost or location level—just add brotli on; to the extraConfig there instead.
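
As a rough sketch of the location-level variant, the fragment below turns on dynamic Brotli compression for a single location only. The vhost name, server name, path and backend address are made up for illustration, and the module still has to be loaded through additionalModules:

```nix
{ config, pkgs, lib, ... }:
{
  services.nginx = {
    enable = true;
    additionalModules = [ pkgs.nginxModules.brotli ];
    virtualHosts."example" = {
      serverName = "www.example.net"; # hypothetical host name
      # For the whole vhost, the directive would go here instead:
      # extraConfig = "brotli on;";
      locations."/api/" = {
        proxyPass = "http://127.0.0.1:8080"; # hypothetical backend
        extraConfig = ''
          brotli on;
          brotli_types application/json;
        '';
      };
    };
  };
}
```

With this in place, only responses served from /api/ are compressed on the fly; everything else on the vhost is sent as-is.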

Precompressing assets

Dynamically compressing responses generated by a dynamic service is useful, but sometimes we also serve stuff that is entirely static. Since, in that case, we have the static files ahead of time, we can compress them ahead of time as well. As a bonus, in this situation we can also use higher compression levels—when building our website ahead of time, we do not have the same time constraints as an active Web server.

When brotli_static is set to on and Nginx is asked for a file, it will look for a file with the same name as the requested one, but with .br suffixed, and serve that compressed file directly in response (if the browser can accept Brotli compression). This means that in order to serve precompressed files, we simply need to put a bunch of files with .br extensions next to our original files.

Modern fancy webapp build systems often have something that can be inserted into the build pipeline to generate such precompressed files. When dealing with packages in Nixpkgs, however, we might want to avoid patching the build scripts in each package to get it to emit .br files, and instead add them to the finished thing with a more generic solution. To this end, we can write a derivation that uses a simple Bash script and the brotli binary.

{ stdenvNoCC
, lib
, brotli }:
{ src
, fileExtensions ? [
  "html" "js" "css" "json" "txt" "ttf" "ico" "wasm"
]}:
let
  findQuery = lib.flatten (lib.intersperse "-o"
    (map (ext: [ "-iname" (lib.escapeShellArg "*.${ext}") ]) fileExtensions));
in stdenvNoCC.mkDerivation {
  inherit src;
  inherit (src) version;
  pname = "${src.pname}-brotlified";

  nativeBuildInputs = [
    brotli
  ];

  buildPhase = ''
    find . -type f \( \
      ${lib.concatStringsSep " " findQuery} \
      \) -print0 | xargs -0 -P $NIX_BUILD_CORES -I{} brotli -vZk {}
  '';

  installPhase = ''
    cp -r . $out
  '';
}

brotlify.nix, a function for Brotlifying existing static websites or webapps

Let's follow this from inside out, starting with buildPhase. First, we call find to locate files of the types that we want to compress, and pass their list to xargs, so that we can do parallel execution with -P (which cannot be done with find -exec).

The variable $NIX_BUILD_CORES is supplied by Nix, and is meant to be the number of concurrent jobs that should be executed during a build. It corresponds to the nix.conf setting cores (nix.buildCores in NixOS config), is meant to be the number you would pass to make -j, and can vary from build machine to build machine (hopefully your builds don't break reproducibility when parallelized). Nix actually has a second setting—max-jobs (nix.maxJobs in NixOS config)—which controls the number of build jobs (builds of individual derivations) the particular machine will run at the same time. Each build job can potentially run cores processes, so you could have max-jobs×cores processes running at the same time.
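
To make the relationship between the two settings concrete, here is a small NixOS fragment using the option names mentioned above. The numbers are arbitrary examples, not recommendations:

```nix
{ config, pkgs, lib, ... }:
{
  # Option names as used in the text; newer NixOS releases spell these
  # nix.settings.max-jobs and nix.settings.cores.
  nix.maxJobs = 2;    # at most two derivations are built at the same time
  nix.buildCores = 4; # each of those builds sees NIX_BUILD_CORES=4

  # Worst case: 2 × 4 = 8 build processes running simultaneously.
}
```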

Our find call is supplied with a list of arguments from findQuery, which tells it which files to locate. What we are doing here is turning a list of extensions (like, say, [ "html" "css" "js" ]) into a series of -iname (case-insensitive name match) arguments separated by -o ("or"), which would turn our example into -iname '*.html' -o -iname '*.css' -o -iname '*.js'. We enclose this all in parentheses (escaped, so they don't get eaten by bash), because otherwise find would interpret the query as (in pseudocode) (of type file and extension "html") or extension "css" instead of of type file and (extension "html" or extension "css").
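
The transformation can be tried in isolation. The sketch below assumes that nixpkgs is reachable through NIX_PATH as <nixpkgs>, and that the expression is saved in a file (the name findquery-example.nix is made up) and evaluated with nix-instantiate --eval:

```nix
# findquery-example.nix (hypothetical file name)
let
  # Only the library functions are needed, not the whole package set.
  lib = import <nixpkgs/lib>;
  fileExtensions = [ "html" "css" "js" ];
  # Same expression as in brotlify.nix.
  findQuery = lib.flatten (lib.intersperse "-o"
    (map (ext: [ "-iname" (lib.escapeShellArg "*.${ext}") ]) fileExtensions));
in
# Evaluates to the string:
#   -iname '*.html' -o -iname '*.css' -o -iname '*.js'
lib.concatStringsSep " " findQuery
```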

Our derivation is two nested functions. This lets callPackage supply the inputs of the outer function, leaving us with the inner function, which we can then call directly, something like this:

{ config, pkgs, lib, ... }:
let 
  brotlify = pkgs.callPackage ./brotlify.nix { };
in {
  # …
  services.nginx.virtualHosts."element" = {
    serverName = "element.example.net";
    root = brotlify { src = pkgs.element-web; };
    extraConfig = ''
      brotli_static on;
    '';
  };
}
Example NixOS configuration, setting up Nginx to serve Brotlified Element (the Matrix client)

Layered Brotlification

In our previous brotlify.nix, we simply added some .br files to an existing webapp, and copied the whole thing to a new output. This is convenient, but sub-optimal when it comes to composability.

Consider, for example, that we may wish to provide .gz files in addition to .br files. Nginx can serve precompressed .gz files, using a module which is built into the Nginxes from Nixpkgs by default. While gzip compression is usually less resource intensive than Brotli compression, there is the Zopfli project (Google apparently likes to name their compression projects after bread), which tends to produce .gz files with better compression ratios than those achieved by GNU's gzip, at the cost of longer compression times.

We could create something like brotlify.nix that calls zopfli on all the files instead of brotli, and then chain them, like brotlify { src = (zopflify { src = element-web; }); }. We can reason that the functions commute: regardless of whether we apply the Brotlification or the Zopflification first, we still get the same result, as each function simply adds more files, leaves existing files unchanged, and will not try to compress the other's outputs. The problem is that Nix cannot reach that conclusion, and so brotlify { src = (zopflify { src = element-web; }); } and zopflify { src = (brotlify { src = element-web; }); } are not the same derivation. Add a couple more filter functions like this, and the number of possible permutations becomes a problem.

A different solution is to make brotlify and zopflify output just their corresponding compressed files, and then overlay their outputs on top of the original webapp. In addition to nicer composability, we also receive the benefit of being able to apply the functions concurrently, while with the previous example, we would have had to wait for the first function to produce an output before applying the second function. Let's modify brotlify.nix for use with this pattern.

{ stdenvNoCC
, lib
, brotli }:
{ src
, fileExtensions ? [
  "html" "js" "css" "json" "txt" "ttf" "ico" "wasm"
]}:
let
  findQuery = lib.flatten (lib.intersperse "-o"
    (map (ext: [ "-iname" (lib.escapeShellArg "*.${ext}") ]) fileExtensions));
in stdenvNoCC.mkDerivation {
  inherit src;
  inherit (src) version;
  pname = "${src.pname}-brotlified";

  nativeBuildInputs = [
    brotli
  ];

  buildPhase = ''
    find . -type f \( \
      ${lib.concatStringsSep " " findQuery} \
      \) -print0 | xargs -0 -P $NIX_BUILD_CORES -I{} brotli -vZ {}
  '';

  installPhase = ''
    find . -type f -iname '*.br' -exec install -m444 -D {} $out/{} \;
  '';
}
brotlify.nix, but this time it only outputs .br files, preserving directory structure

The only change is essentially that we now only copy *.br files to $out, preserving the same directory structure (which is where install -D comes in handy). We can conceptualize the corresponding zopflify version as being essentially the same, except calling zopfli instead of brotli and exporting *.gz files instead of *.br files.
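
For illustration, a zopflify.nix along those lines might look roughly like the sketch below. It assumes the zopfli package from Nixpkgs and relies on the zopfli command writing a .gz file next to each input while leaving the input in place; it is run here with its default options, which is a choice made for this example rather than a tuned setting.

```nix
{ stdenvNoCC
, lib
, zopfli }:
{ src
, fileExtensions ? [
  "html" "js" "css" "json" "txt" "ttf" "ico" "wasm"
]}:
let
  # Same find query construction as in brotlify.nix.
  findQuery = lib.flatten (lib.intersperse "-o"
    (map (ext: [ "-iname" (lib.escapeShellArg "*.${ext}") ]) fileExtensions));
in stdenvNoCC.mkDerivation {
  inherit src;
  inherit (src) version;
  pname = "${src.pname}-zopflified";

  nativeBuildInputs = [
    zopfli
  ];

  # Compress every matching file; zopfli writes FILE.gz next to FILE.
  buildPhase = ''
    find . -type f \( \
      ${lib.concatStringsSep " " findQuery} \
      \) -print0 | xargs -0 -P $NIX_BUILD_CORES -I{} zopfli {}
  '';

  # Export only the .gz files, preserving the directory structure.
  installPhase = ''
    find . -type f -iname '*.gz' -exec install -m444 -D {} $out/{} \;
  '';
}
```

One caveat of this sketch: any .gz files already present in the source tree would be copied to the output as well.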

To compose all of this together, we use the buildEnv function in Nixpkgs. buildEnv combines several directory trees together using symlinks, which is handy for all sorts of things, like "installing" several end-user packages into one environment. It can also be used to layer our compressed overlays over the base package. With the default configuration, Nginx will follow symlinks, so a directory tree full of symlinks is fine to use as root.

{ config, pkgs, lib, ... }:
let 
  brotlify = pkgs.callPackage ./brotlify.nix { };
  zopflify = pkgs.callPackage ./zopflify.nix { };
  elementComposed = pkgs.buildEnv {
    name = "element-composed";
    paths = [
      pkgs.element-web
      (brotlify { src = pkgs.element-web; })
      (zopflify { src = pkgs.element-web; })
    ];
  };
in {
  # …
  services.nginx.virtualHosts."element" = {
    serverName = "element.example.net";
    root = elementComposed;

    extraConfig = ''
      brotli_static on;
      gzip_static on;
    '';
  };
}
NixOS configuration using the new brotlify and zopflify functions, combined using buildEnv

The nixos-rebuild script, when asked to switch to this configuration, will build both the Brotlified and Zopflified versions, possibly concurrently.

We can verify that the server is indeed serving precompressed responses by issuing a request with curl: curl --raw https://element.example.net/olm_legacy.js -H 'accept-encoding: br', and possibly comparing it against the precompressed version on disk. If you need to find the store path of the composed tree on a running system, you can do systemctl cat nginx to find out the location of the Nginx config file, and then inside the config file, look for the root directive that was emitted under the relevant vhost.

Further notes

The above derivations could use some further improvements. For one, they indiscriminately compress everything, including very small files. Compressing very small files can result in compressed files that are actually larger than the original, which is why both the Brotli module and the built-in gzip module by default do not apply compression to small responses; the default threshold is 20 bytes for both. Still, it would be prudent to examine how much each file was actually compressed, and prune those which do not show sufficient improvement over uncompressed sizes.
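
One way such pruning might be approached is sketched below, as an attribute set holding an alternative installPhase for the layered brotlify.nix. It assumes that buildPhase runs brotli with -k, so that each original file is still present next to its .br counterpart, and the 10% threshold is an arbitrary number picked for the example:

```nix
{
  # Alternative installPhase: only export .br files that are at least
  # 10% smaller than the file they were produced from.
  installPhase = ''
    # Make sure the output exists even if nothing passes the check.
    mkdir -p "$out"
    find . -type f -iname '*.br' -print0 | while IFS= read -r -d "" compressed; do
      original="''${compressed%.br}"
      # Skip .br files that have no original next to them.
      [ -f "$original" ] || continue
      originalSize=$(stat -c %s "$original")
      compressedSize=$(stat -c %s "$compressed")
      # Integer arithmetic: compressed <= 90% of original.
      if [ $((compressedSize * 10)) -le $((originalSize * 9)) ]; then
        install -m444 -D "$compressed" "$out/$compressed"
      fi
    done
  '';
}
```

Files that fail the check are simply left out of the output, so Nginx falls back to serving the uncompressed original (or compressing it dynamically, if that is enabled).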

Another thing to note is that combining compression and encryption opens the possibility of compression oracle attacks. These attacks rely on the fact that plain text with repeating strings will compress to a shorter length than text where those strings do not repeat. Consider a webpage that displays both a secret token and some user input. If both the user input and the token are the same, the page will compress to a shorter length than if they are different. An attacker who can repeatedly try different user inputs can, therefore, check the length of the response to see if their guess was correct, even if they cannot actually decrypt the response. This is the basic principle behind the BREACH attack.

Nevertheless, compression of static, non-secret assets is generally safe.

Further reading