Embedding fonts in SVGs

2023 July 23, 02:31

As previously established, SVGs are nice. Being a vector format, they display well at varied sizes. Being a vector format, they also have to be rasterized by a renderer before display, which introduces an opportunity for rendering inconsistencies between different platforms.

One possible source of such rendering inconsistencies is fonts. Just like with HTML, the basic use of a font requires that the font be available to the rendering program locally; if the font is not installed, the renderer may fall back to a different font. This is particularly undesirable if, for example, a piece of text is supposed to fit within a box: if the dimensions of the text change, it could possibly end up going over the box's borders. Therefore, if we want such SVGs to always look correct, we have to figure out how to work around this limitation.

Converting fonts to paths

Font glyphs (at least in a TrueType font) are defined as contours consisting of curves and lines. SVGs can also do paths that consist of curves and lines. It follows that text could be turned into SVG paths.

Properly rendering Unicode text can be difficult, so we could ask an existing font rendering library to typeset our text properly, give it to us in a vector format, and then turn that entire bit of text into paths. This is how several tools do it, including Inkscape, which can turn text into paths via the Object ▸ Object to Path command.

The disadvantage of converting the entire text object into paths in this way is that each glyph rendered will be a separate path. Even when a character occurs within the text multiple times, each occurrence will be stored as a separate path in the SVG file. How much of a problem this is depends on how much text there is, so it can get annoying if there is a lot of it. In theory, it should be possible to list individual glyphs in <defs> of an SVG image, and then <use> them repeatedly, but the difficulty with that is the need to hook into a different point of the text rendering path.
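As a rough illustration of the <defs> and <use> idea, here is a minimal Python sketch. It is not how Inkscape or any other tool does it. The function name text_to_use_svg, the glyph_paths mapping, and the fixed advance width are all assumptions made for the example; a real implementation would take the outlines and positions from a text shaping library, which is the hard part described above.

```python
from xml.sax.saxutils import quoteattr


def text_to_use_svg(text, glyph_paths, advance, y=0):
    """Emit each distinct glyph once in <defs>, then place it with <use>.

    glyph_paths maps a character to SVG path data (hypothetical outlines).
    advance is a fixed horizontal step, so there is no kerning or shaping.
    """
    # Only characters that both occur in the text and have an outline.
    used = sorted(set(text) & set(glyph_paths))
    defs = "".join(
        f'<path id="g{ord(ch):x}" d={quoteattr(glyph_paths[ch])}/>'
        for ch in used
    )
    # One <use> per occurrence; characters without an outline (such as
    # spaces) are skipped but still take up one advance step.
    uses = "".join(
        f'<use href="#g{ord(ch):x}" x="{i * advance}" y="{y}"/>'
        for i, ch in enumerate(text)
        if ch in glyph_paths
    )
    return (
        '<svg xmlns="http://www.w3.org/2000/svg">'
        f"<defs>{defs}</defs>{uses}</svg>"
    )
```

With placeholder outlines for "b", "e" and "p", the word "beep" ends up as three stored paths and four <use> references, instead of four separate paths.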

There are other downsides to this approach: when rendered as a document, text in an SVG is selectable, but text-rendered-as-paths is not. Inkscape inserts the original text into an aria-label, so at least accessibility tools can still read it.

Conversion of text to SVG paths is also lossy. On displays without a high pixel density (we have to assume not everyone is using fancy new high density displays, at least not all the time), font rendering can involve applying hints that the font file provides for aligning the glyphs to the pixel grid. The font rendering system can also do things like subpixel rendering. As rendering SVG paths usually does not involve such steps, text-as-text and text-as-paths can end up rendered differently:

The word "beep" three times, twice in black, and the third time appearing mostly as a pixelated outline. — The word "beep" was inserted into a nominally 1000×1000 px SVG image twice, with the first one staying as text, and the second one turned to path by Inkscape. The image was then rendered in Firefox at 100×100 px. The third "beep" is the difference between the first two. The typeface is Bitter.

Nevertheless, this approach is expedient, and works fairly well for, say, text that is part of a logo-like design. But perhaps we can find a different solution...

Fetching fonts

CSS-as-used-in-HTML supports the @font-face at-rule, which can be used to tell the browser where to download a font. The font can then be used as if it was installed locally—the browser will download it when it needs to, and use it to render text in the document. A rather minimal example looks something like this:

@font-face {
  font-family: "Comic Neue";
  font-style: normal;
  font-weight: normal;
  src: url("/ComicNeue-Regular.woff2");
}

body {
  font-family: "Comic Neue", sans-serif;
}

A stylesheet declaring a @font-face rule, which tells the browser that the Comic Neue font in normal weight and style is available for download under a relative URL. The browser may consult this rule when it encounters Comic Neue in the font-family property for body.

@font-face is also available with CSS-as-used-in-SVG, but there is an obstacle to using it when showing SVGs in the browser: when used as images, SVGs cannot download external resources.

There are several ways to use SVGs with HTML documents, like <img> elements, <object> elements, inlining, or use as CSS images. The uses fall into two broad categories: one where the SVG is treated like an image, and one where the SVG is treated like an XML document. When an SVG is included via, say, an <object> element, it is treated like an XML document, and so it can do stuff like download external resources or run embedded JavaScript. On the other hand, when an SVG is included via an <img> element, or used as a background via CSS, it is treated like an image, and restricted from accessing external resources, running scripts, or providing interactivity; it acts like a raster image would. As such, use of external resources in @font-face limits how we can use our SVGs.

As an aside, these limitations mean that viewing an untrusted SVG included via an <img> tag—such as on a website that allows user-uploaded media—will not result in immediate disaster. When it is viewed this way, scripts inside the SVG will not run at all. However, SVGs viewed in a tab by themselves do run as XML documents. With the user-uploaded media example, an attacker could thus link to the SVG file directly in order to get someone to run malicious scripts in the context of the domain where the SVG was uploaded. Furthermore, the browser may expose an Open Image in New Tab option in the right click menu for an SVG in <img>, which provides a way for the user to wander into running the SVG's scripts.

Embedding fonts

The CSS url() function can accept data URLs. As a data URL does not require accessing an external resource, we can use it to include a font in a way that will work even when the SVG is included via an <img> element. As a bonus, we end up with an SVG that is self-contained.

In order to put our font file in a data: URL, we have to encode it with base64, which adds about 33% to the file size, since every 3 bytes of input become 4 characters of output. Most modern browsers support WOFF2 (a compressed font format designed specifically for transferring fonts to browsers), so we can offset the space requirement somewhat. It would look something like this:

<svg
  viewBox="0 0 100 100"
  xmlns="http://www.w3.org/2000/svg"
>
  <style>
    @font-face {
      font-family: "Comic Neue";
      font-style: normal;
      font-weight: normal;
      src: url("data:font/woff2;base64,d09GMgABAAAAAFfQAA8…"); /* full, long base64 omitted */
    }
    text {
      font-family: "Comic Neue", sans-serif;
      font-size: 20px;
    }
  </style>
  <text x="0" y="50">beep boop</text>
</svg>

An SVG embedding a font in it, and using it to render "beep boop".
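The @font-face rule with a data: URL can be generated rather than assembled by hand. The following is a minimal Python sketch using only the standard library. The helper names font_face_rule and base64_length are made up for this example, and the font bytes are assumed to have been read from a WOFF2 file beforehand.

```python
import base64


def font_face_rule(family, font_bytes, mime="font/woff2"):
    """Build an @font-face rule that carries the font as a data: URL."""
    payload = base64.b64encode(font_bytes).decode("ascii")
    return (
        "@font-face {\n"
        f'  font-family: "{family}";\n'
        f'  src: url("data:{mime};base64,{payload}");\n'
        "}"
    )


def base64_length(n_bytes):
    """Number of base64 characters needed for n_bytes of input.

    base64 emits 4 characters for every 3 input bytes and pads the
    last group, which is where the roughly 33% overhead comes from.
    """
    return 4 * ((n_bytes + 2) // 3)
```

For instance, a 3000-byte font file turns into 4000 characters of base64, plus the fixed cost of the surrounding rule.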

Subsetting fonts

The problem with embedding a whole font in our SVG file is that we are embedding the whole font. The font file may include glyphs which are not going to be used in rendering the image at all. If the text we are including only ever makes use of a subset of all the available glyphs in the font, we could drop all the unused glyphs, to save on space. This technique is commonly used with PDFs (which can also embed fonts), but we can also bring it to SVGs.

The fontTools package contains a subset module, which can be used for producing subsets of fonts. Properly producing a font subset can be tricky, as we have to make sure we do not omit any of the data required to render the desired text (which can include more than just the obvious glyphs), but fortunately the subsetting tool's default settings work well here. fontTools can also output a compressed WOFF2 file, although this requires fontTools with the woff extra. It works like this:

$ pyftsubset --text='beep boop' --output-file=ComicNeue-Regular-subset.woff2 --flavor=woff2 ComicNeue-Regular.otf
$ base64 -w0 ComicNeue-Regular-subset.woff2
d09GMk9UVE8AAAS4AAwAAAAAB3QAAARsAAIAxAAAAA…

Subsetting a font. The full base64 output has been, once again, omitted.

Extracting the text from the SVG image can be the tricky part. It is easy enough to specify "beep boop" on the command line, if that's all our image contains. pyftsubset can also read the text from a plain text file, but we have to actually extract the text from the SVG first. As a quick hack, we could just copy and paste the text into a file, and feed that to pyftsubset. More properly, the solution would involve walking the XML tree, finding the text elements, copying their contents, and dumping that out. This process has been automated, with tools like svgoptim, which I have not, however, successfully tested (and the author of which has a blog post much like this one, but better). In the end, the result is something like this:

The words "beep boop" — An SVG image, using embedded subset of a font to render text
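The tree-walking step mentioned above might look something like the following Python sketch, which uses the standard library's XML parser. It is a simplified take, not what svgoptim does: the function name svg_text_characters is invented for the example, and it assumes that all rendered text lives inside <text> elements (including nested <tspan> elements) in the SVG namespace.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def svg_text_characters(svg_source):
    """Collect the distinct characters used inside <text> elements."""
    root = ET.fromstring(svg_source)
    chars = set()
    for element in root.iter(f"{SVG_NS}text"):
        # itertext() also yields the contents of nested <tspan> elements.
        chars.update("".join(element.itertext()))
    return "".join(sorted(chars))
```

The returned string could then be handed to pyftsubset through its --text option, or written to a plain text file first.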

Which method to use

For short headline bits of text, or text incorporated into designs like logos, turning it into SVG paths is generally an expedient way to get decent results. For things like multiple longer diagram labels, or stretches of body text, embedding a subsetted WOFF2 file offers a portable SVG with a minimized file size, which is usable in <img> elements. There is also the option of not using any of these techniques, if the design can tolerate substitute fonts being used.

Further stuff

Comic Neue – The font used in the examples above