{"version":"https://jsonfeed.org/version/1.1","title":"Web log on dee.underscore.world","home_page_url":"https://dee.underscore.world/blog/","feed_url":"https://dee.underscore.world/blog/feed.json","icon":"https://dee.underscore.world/favicon.png","authors":[{"name":"Dee","url":"https://dee.underscore.world"}],"language":"en-US","items":[{"id":"https://dee.underscore.world/blog/y2k-as-seen-on-tv-as-seen-now/","url":"https://dee.underscore.world/blog/y2k-as-seen-on-tv-as-seen-now/","title":"Y2K, as seen on TV, as seen now","date_published":"2024-01-26T21:29:31.000Z","date_modified":"2024-01-26T21:29:31.000Z","content_html":"
HBO recently released a documentary called Time Bomb Y2K. Several places—including IMDb—offer a short blurb about the film:
\n\n\nAn immersive, all-archival retelling of the "Y2K" millennium bug and the mass hysteria that changed the fabric of modern society.
\n
In some corners of the Internet, this summary caused the particular reaction that seems to have become common whenever the year 2000 problem is mentioned. Although the 1990s were in the ancient times of over 24 years ago, there are still people alive who were alive back then. Some of those people were also, at the time, involved in year 2000 problem mitigations. Some of those people take issue with the framing of the problem as "mass hysteria", pointing out that the issue was actually real, and it was work like theirs that ensured the issue was addressed before it became a more serious problem.
\nIndeed, this seems to be the trajectory the year 2000 problem has taken in popular perceptions. Previously, people—either those who remembered that time, or those who learned about it later—often associated it with the prevalence of irrational fears of impending literal apocalypse. Recently, though, it is more common to see people vehemently arguing at the other end of the spectrum, defending the position that the year 2000 problem was actually very serious.
\nBlurb notwithstanding, the film actually does not attempt to make the point that the year 2000 problem was a lie. Nevertheless, the broader shifting of sentiment about the year 2000 problem, as manifested in the reaction to the film, does make sense if we consider how serious, global problems are talked about today. A useful way to approach the film, then, is with an eye on how the 1990s and the 2020s differ in how they deal with their problems.
\nDirected by Marley McDonald and Brian Becker, and first shown at a film festival in early 2023, Time Bomb Y2K consists entirely of contemporary footage from the 1990s, with no extra narration. The film splices news reports, snippets of documentaries, assorted B-roll, less-professionally published tapes, and even home movies into a chronological narrative.
\nThe film starts in the mid-1990s, and moves forward from there, with larger portions of the film devoted to the times closer to the year 2000. It tries to convey the general mood of the times, or at least the general mood of the times as seen on American television. It also includes some coverage of more fringe elements of society, and their reaction to the year 2000 problem and its possible consequences. At the end, we find out that (spoiler alert) civilization did not end on January 1st, 2000, and get to watch the people of the year 2000 express their relief and joy.
\nAt times, the film is cheeky, referencing our current times in the same way that a prequel from a popular franchise may reference later canon. Elon Musk, Jeff Bezos, and Osama bin Laden show up at various points. At one point, someone mentions an example of a car that may refuse to start after the year 2000, because it believes it has not been serviced for a hundred years, which seems directed at the people of 2023 and their cars that refuse to start due to a failed over-the-air software update.
\nFor inhabitants of 2024 who are looking back at the year 2000 problem, the COVID-19 pandemic and the climate crisis both come to mind. All three are examples of global problems, which require considerable efforts to address. All three are things that people could have been better prepared for earlier on.
\nIt is these comparisons that are likely why current day discussions of the year 2000 problem often involve strong condemnations of seeing the problem as exaggerated. The hypothetical individual who is the target of condemnation here is one who believes that the climate crisis and the COVID-19 pandemic are also exaggerated problems, and the efforts already spent on them were excessive. This individual would then presumably point at the year 2000 problem as another example of something that generated more concern than it should have—a mass hysteria, if you will.
\nIt is true that the year 2000 problem could have been addressed earlier. Even before the 1990s, people were aware of the fact that the year 2000 may happen in the future, and that some of their computer systems were incapable of handling it properly. Eventually, though, enough awareness of the problem was raised in the mainstream, and successful mitigations were applied, which is something many wish would happen with problems of today. There are, however, some fundamental differences between then and now.
\nOne difference between the 2020s and the 1990s that Time Bomb Y2K highlights is in how people interacted with news, and information in general.
\nA point that frequently comes up in news reports and interviews shown in the film is that the looming problem may affect even those who do not own a computer. This made things scarier for people of the 1990s, but to the people of the 2020s, it also serves as a reminder that back then, it was less common for people to get their news through a computer.
\nFrom the vox pop interviews included in the film, one gets the impression that people mostly had only a vague idea of what the year 2000 problem was, how it would affect them, and what was being done to fix it. These people have the idea that someone, somewhere is doing something to fix it, but they do not know the details, and they do not know how well that is going.
\nThis sort of mass anxiety stands in contrast with the modern day. In the 2020s, people are far more confident in expressing opinions and providing explanations about what is happening, regardless of whether those are based in fact or not. It may be tempting to think that misinformation in the 1990s spread the same way it spreads now, but it could not have, and so things worked differently.
\nThe film also covers the more fringe responses to the impending year 2000. These range from the more ordinary individualist preppers, to right-wing militias and apocalyptic religious movements.
\nSome people shown take reasonable precautions, ensuring they have enough supplies to survive a possible disaster, in concert with their community. Some take a more extreme approach, stockpiling a remote residence in the mountains with food and guns for the coming societal breakdown. Some quit their jobs and take up homesteading, since the whole civilization thing is going to end anyway.
\n\nThere are also religious leaders who preach about the coming apocalypse, connecting the end of the world predictions associated with the year 2000 problem with more traditional apocalyptic prophecies. Depicted, too, are right-wing militias, who believe that the year 2000 will both usher in a societal collapse, and prompt the federal government of the United States to seize dictatorial levels of control over the nation.
\n\nRight-wingers with a fondness for AR-15s and conspiracy theories regarding a coming New World Order are also a familiar feature of 2024. Their modern-day iteration is, however, more frequently associated with denialism. A modern day right-wing militia may believe that efforts to combat the climate crisis are a plot to gain power (by whatever group the militia is bigoted against), but they will also believe that the climate crisis is made up in the first place.
\nFrom the film, one can get the impression that fringe right-wing and conservative movements in the year 2000 problem era were far closer to the shared reality of everyone else. They kept one foot outside of that reality, as manifested by their theories about conspiracies or religious apocalypses, but they built their narratives on the understanding of the world shared by the broader population. This stands in contrast with COVID-19 or the climate crisis, where the fringe right-wing elements often advance the idea that those things are not real, or if they are, the mitigation efforts are not real.
\nIt is tempting to use the year 2000 problem as an example of how we should do things now. On the surface, it seems like the good old times, when people would actually manage to solve major problems collectively. But, as Time Bomb Y2K reminds us, it was a different time, and it was a different problem. There is no going back to the culture of the 1990s—not that this would be such a good idea in the first place—and there is no going back to the way people interacted and communicated with each other back then.
\nWe are also not facing the year 2000 problem today. Ours are not looming apocalypses with set deadlines. Our problems are not addressed by a bunch of work done behind the scenes, in a world that keeps going on as usual. These problems are not the kind that stop banks from being able to count their money, spurring them to action; they are problems that result in poor people dying, and the 1990s were not that much better about dealing with those.
\nThe ending of the film depicts the sort of relieved, happy, and optimistic mood present during the early year 2000. There is a shot of a sanitation worker disposing of a discarded sign that says "THE END IS NEAR", as if thus getting rid of the gloom and fear of the recent times and ushering in a hopeful future. The very last scenes in the film are interviews with kids, sharing their hopes for a better future. This is the sort of moment in time that people are likely to feel nostalgia for. A moment where people just got through a problem together, and were standing there looking towards the future, hopeful to be likewise unified in dealing with the problems of the coming century.
\n\nBut, the story has, of course, been spoiled for anyone living in the year 2024. We know how the 2000s ended up, with their persisting bigotries, wars, crises, and suffering. With the benefit of hindsight, that view from 2000 seems awfully naive.
\nIt is true that the year 2000 problem was real. It is true that a lot of effort went into ensuring that it would not have serious consequences once the year 2000 actually rolled around. On the other hand, it is also true that, at the time, there were reactions among certain people that exaggerated the problem to advance their conspiracy theories and agendas. This particular sort of hysteria is useful to acknowledge, as echoes of it persist to 2024. However, it is important that, when trying to counteract the misconceptions or denialism about how this particular bit of history went down, we do not erase all the nuance. It is useful to take inspiration from past successes, but it is less useful to turn them into myths of glorious past times.
\nStills included here are from Time Bomb Y2K, copyright to which is held by Home Box Office, Inc.
\nCurrently, The Unicode Standard sees one major release per year (barring unusual exceptions), with an occasional additional point release outside of that normal pace. 2023 has been one of the exceptions, in that there was only a minor release: 15.1.
\nBoth minor and major version updates can come with new emojis. These new emojis subsequently become available in various pieces of software... eventually, one hopes. To understand how new emojis make their way to actual software in use, it helps to understand what the data behind emojis is, where it comes from, and how it is used.
\nWhat is often referred to as Unicode is, more precisely, The Unicode Standard. The Unicode Standard, published by the Unicode Consortium, consists of tables of codepoint assignments, as well as a number of documents describing how the standard's data should be used, and broader practices for handling Unicode text. The Standard includes machine-readable data files, the Unicode Character Database (UCD), which describes codepoint assignments, plus some related extra data.
\nA codepoint (or code point) is essentially a number, and so The Unicode Standard determines what—if any—character is assigned to a given number. Plenty of codepoints are yet unassigned, and so new assignments can be introduced in new versions of the standard. Each character also has certain additional information associated with it, in the form of character properties. These include things like the identifying name, general category, and directionality (some characters are associated with left-to-right scripts, some are right-to-left, and some are more complicated than that). The UCD consists of a number of files which contain this data.
\nSome—but not all—emojis are single characters, and these a get a single codepoint each. UnicodeData.txt
is the file within the UCD that lists codepoint assignments, and so we can find such emojis in it:
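    1F600;GRINNING FACE;So;0;ON;;;;;N;;;;;
    1F680;ROCKET;So;0;ON;;;;;N;;;;;
    1F9D1;ADULT;So;0;ON;;;;;N;;;;;
A few entries of that sort, abridged from the real file, are shown above; each line gives the codepoint, the character's name, and further semicolon-separated properties (the So here is the general category, "Symbol, other").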
The UCD also includes files which specify which characters are actually emoji, and which should have emoji presentation (that is, which should show up in color). Furthermore, The Unicode Standard comes with additional emoji-related, computer-readable data tables that are not part of the UCD proper.
\nNot all emojis are single characters—some are sequences. For example, the astronaut emoji (🧑🚀) is assembled from the adult emoji (🧑) and the rocket emoji (🚀), plus some glue characters. Similarly, the trans flag emoji (🏳️⚧️) involves combining the white flag emoji (🏳) and the trans symbol emoji (⚧), plus some glue characters. Such sequences do not require new codepoint assignments, and could theoretically be devised by vendors on an ad-hoc basis. The standard, however, includes a number of such sequences which are expected to be widely deployed—or, as Unicode puts it, recommended for general interchange (RGI). These sequences are one of the things specified in the emoji data tables that exist outside of the UCD, in a file that looks like this:
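    1F9D1 200D 1F680          ; RGI_Emoji_ZWJ_Sequence ; astronaut         # E12.1 [1] (🧑‍🚀)
    1F3F3 FE0F 200D 26A7 FE0F ; RGI_Emoji_ZWJ_Sequence ; transgender flag  # E13.0 [1] (🏳️‍⚧️)
The lines above are illustrative entries of this shape: the sequence's codepoints, its type, and a name, with a trailing comment noting the emoji version the sequence was added in.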
\n\nEmoji additions to The Unicode Standard are handled by the Emoji Subcommittee (ESC), which operates under the Unicode Technical Committee. Proposals for new emojis (either codepoint or sequence) are open to the public, although they also sometimes originate from within the Consortium. For proposals that make it to the later stages of the approval process, the ESC will generally make the proposal public, and report on its progress. It is therefore possible to tell, ahead of time, what will end up in the next Unicode Standard release.
\nThe Common Locale Data Repository (CLDR) is a project aimed at maintaining a standard repository of a variety of locale data. It is a project maintained by the Unicode Consortium, though it is not part of The Unicode Standard.
\nThe CLDR contains locale data for a wide range of language, region, and script combinations. This includes some of the more usual locale data, such as the way numbers or dates are formatted, or how the days of the week are written. The CLDR also contains some less obvious locale data, such as information on how person names in a given culture are usually collated, or how many different plural forms a language uses.
\nAmong the less obvious CLDR data are character annotations. These are used to assign a short name, and any number of keywords to every emoji. The short name is the main name that an emoji will have, and may be used for text-to-speech systems that need to read the emoji out loud. The keywords are any additional terms that may be useful when searching for that particular emoji. While The Unicode Standard gives each codepoint an English name, that name is intended more as an internal identifier, suitable for use in source code, and such names are only given to codepoints, not to emoji sequences. CLDR annotations, on the other hand, are for both single character emojis and sequences, and are different in every language, making them more suitable for the end user.
\nAnnotation data is in XML, and looks like this:
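    <annotations>
      <!-- Illustrative entries, in the shape used by the English data. -->
      <annotation cp="🚀">launch | rocket | space</annotation>
      <annotation cp="🚀" type="tts">rocket</annotation>
    </annotations>
The plain annotation element carries the search keywords, while the type="tts" one carries the short name.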
\n\nThe CLDR is managed by the CLDR Technical Committee (CLDR-TC) within the Unicode Consortium. The project is maintained by Consortium members, as well as affiliated institutions and individuals with relevant language expertise. The unaffiliated public can also participate in a limited way, by filing requests for changes, to be reviewed by the members. The CLDR is released on its own six-month schedule, independent of The Unicode Standard.
\nIn order to render new emojis, software will generally require new graphics that depict them, in addition to possibly needing the latest version of the Unicode Character Database. Standards changes that would require code changes to the rendering stack are less common.
\nAs the UCD describes which characters are emojis, updating it may be necessary to inform software about which characters it should treat as emojis. When it comes to emoji handling by the operating system, this generally means a lower-level component or library, which may either vendor the UCD, take it in at build time, or vendor an already pre-processed version of the UCD.
\nThe other thing needed to display new emojis is the graphics to actually display them. While the Unicode Character Database does not contain any instructions on what a given emoji should look like, the proposals and other documentation on the Unicode website generally do include examples. These are also usually available prior to the formal release date for the given Unicode Standard version, so it is technically possible for vendors to have graphics ready before that date.
\nTwo popular permissively licensed emoji sets are Google's Noto Emoji, and Twitter's Twemoji. Both of these projects often put out a release that supports the new version of The Unicode Standard within about a month after The Unicode Standard's release, though some outliers do happen. In particular, Twemoji was affected by a petulant billionaire buying Twitter, which resulted in several people involved in handling the project being fired, and in an apparent halt to Twitter (now called "X") releasing its emojis under a permissive license. The Twemoji project now continues as a fork, maintained by some of the original authors.
\nBoth Noto Emoji and Twemoji are shipped as a series of vector graphics. These graphics can be built into font files, and such fonts are generally how new emojis end up making it into desktop and mobile operating systems. Google ships the tooling needed to turn Noto Emojis into an OpenType font, and also makes pre-built font files available; Twemoji fonts can be built with a third party tool twemoji-color-font, with pre-built fonts likewise distributed by that project.
\nOther times, the graphics are used more directly. Web applications, for example, may replace emoji characters by images loaded via HTML. In such situations, there is often a dedicated library which either brings its own graphics, or can be configured to use a separately provided graphics set. It is these components which need to be updated in such a situation.
\nIt is technically possible to input emojis the same way as any other arbitrary Unicode character: through character pickers, numerical codepoint entry, or copy-and-paste from an outside source. These methods are not very practical, though, which is why emoji pickers and other emoji input methods exist.
\nEmoji pickers can source their data from multiple places, but the Common Locale Data Repository is often among them. Like with the UCD, the CLDR data is often vendored, and is often pre-processed in some way prior to being vendored. As such, building the picker with an updated CLDR may require an extra step to update the CLDR-derived data files within the picker's source code.
\nEmoji pickers also often use an intermediate dependency as a way to access emoji data. While with the UCD, software will generally get its Unicode data just from the UCD, emoji pickers may use CLDR partially, indirectly, or even not at all. Projects such as emoji-data or emojibase maintain their own lists of emoji shortcodes which may be derived from CLDR data, but are not always a one-to-one mapping. Such projects may take the CLDR into consideration, but the shortcodes and other keywords are ultimately assigned by the project, and so subject to individual editorial control. This means updates are not automated.
\nAs with displaying emojis, emoji input also ends up implemented on multiple layers: there are OS-level emoji input methods, and individual apps or websites can have emoji input methods of their own. Of note is the fact that, while a website replacing emojis with its own images means the original glyphs are not rendered, a website providing shortcode expansion usually does not prevent an OS-level picker from inserting emojis like any other Unicode character.
\nWhen writing software that may require emoji data from the UCD or the CLDR, there are some things to consider.
\nParsing UCD or CLDR data directly at runtime may be impractical, and so pre-processing it into a format that can be included at build time often makes sense. In this case, it is also common to vendor the data, so that the UCD or CLDR data does not have to be supplied as a build-time dependency.
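\nAs a sketch of the idea, a small build script along these lines (with illustrative file names) could boil UnicodeData.txt down to a simple mapping:
    import json

    # Reduce UnicodeData.txt to a codepoint-to-name mapping that the
    # application can then embed or load at build time.
    names = {}
    with open("UnicodeData.txt", encoding="utf-8") as ucd:
        for line in ucd:
            fields = line.rstrip("\n").split(";")
            names[int(fields[0], 16)] = fields[1]

    with open("ucd-names.json", "w", encoding="utf-8") as out:
        json.dump(names, out)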
\nNew emojis are generally introduced in the UCD once a year, and the CLDR is updated twice a year. This represents an update that has to be performed on a regular basis, albeit rarely. When vendoring this data directly, it can be useful to set up procedures for updating it in the future. If possible, it can also be useful to set up a way for packagers to patch in the latest UCD or CLDR data at build time, even if it is otherwise vendored.
\nWhen implementing emoji pickers at the app level, it is good to remember that the user may actually have a functioning method for inputting emojis at the operating system level. Most of the time, when dealing with text input fields, this doesn't require any extra handling. There are, however, situations where the user is forced to input an emoji through the picker, like when adding a reaction to an item in a chat app. In such cases, the user may still wish to, for example, select one of the recent emojis from their phone's keyboard, which is something that the picker should facilitate.
\nBoth the UCD, and the CLDR, including the parts excerpted here, are available under the Unicode Data Files and Software License. More details are available on the Unicode website.
\nOne of the fun things about running your own personal website is that you can add things to it that are not strictly needed, but might be interesting to add anyway. While I try to not actually break the basic functionality of the website, I feel more free to add miscellaneous new stuff.
\nThis is how I ended up deciding to add Pagefind-powered search to my website. I assume that people do not regularly visit my website and then find themselves wishing it had a search feature, so that they could use it to locate a particular web log article, considering I have published fewer than 25 of those over the entire time this website has existed. But, it was possible for me to add such a feature, and so I did.
\nSearch engine implementations for static websites can generally be divided into two categories: server-side and client-side. With server-side, something has to build an index of all the pages within the site, and then something has to handle search-related HTTP requests by consulting that index. Unlike the static website itself, such functionality cannot be hosted off a server that simply serves static files over HTTP—the thing that consults the index needs to be able to execute code on the back end, which makes hosting more complicated. This kind of search functionality often ends up outsourced to a third party.
\nThe other option for static website search is moving the search to the client. In the simplest implementation of this idea, the client downloads a search index of the website, along with some JavaScript code capable of using that index. These can be static files, so the server does not have to do anything fancy, but it also means the client has to do more work. While extra CPU cycles and memory use by the client can be an acceptable trade-off, the bigger problem can be transferring the index. Since such an index is essentially the whole website, but in a different shape, it can involve transferring quite a lot of data to the client.
\nPagefind solves the index size problem by splitting the index into chunks. While the index is still mostly the whole website in a different shape, the whole index does not have to be downloaded to perform a search. Instead, the client ends up downloading only the chunks it needs to search for the given terms, and to display the relevant results.
\nPagefind is implemented as a command line tool, written in Rust. The tool is designed to be operated on a local copy of the site's directory tree of HTML files, into which it emits a subdirectory of files that implement the search engine. It creates an index of all the pages in the site, while also gathering metadata about each page. It then splits that search index into chunks, and saves those index chunks, as well as the metadata for each page, into individual, compressed files. The tool also emits the client API code—a combination of JavaScript and WebAssembly—as well as an implementation of a search user interface that uses the API.
\nSince I run Eleventy directly, and not from a build system or task runner, I wanted to run Pagefind as part of the Eleventy build process. As of version 1.0, Pagefind comes with a Node.js interface, which runs the Pagefind binary, and talks to it over standard input and output. I, however, started working on this problem before Pagefind 1.0 was released, and so I went with the simpler method of invoking the Pagefind binary directly.
\nEleventy emits a number of events during a build process. One of those events is eleventy.after
, which happens after a build is done (predictably enough). We can hook into this event to run Pagefind on the site, after it's done building:
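    // .eleventy.js, a sketch: "_site" is Eleventy's default output
    // directory, and --site is the flag used by Pagefind 1.x (earlier
    // versions called it --source).
    const { execFileSync } = require("node:child_process");

    module.exports = function (eleventyConfig) {
      eleventyConfig.on("eleventy.after", () => {
        execFileSync("npx", ["pagefind", "--site", "_site"], {
          stdio: "inherit",
        });
      });
    };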
The npm package for Pagefind pulls in a pre-compiled binary (at least for the supported platforms), so we are calling it through npx
. Since we do not use Pagefind's Node.js API, we could also provision the Pagefind binary in some other way, get it into PATH
, and call it directly. The npm approach is easier, but might be undesirable for situations where building such things locally is preferable.
Pagefind can store arbitrary key–value pairs of metadata for each page it indexes, and the client API can then be used to access that data for each result it finds. Pagefind stores some metadata by default—for example, the title of a result—and then uses that metadata when rendering results with the default results interface.
\nSince Pagefind indexes the finished HTML output of a static site, this is also where it gets its metadata from. To indicate to Pagefind where this metadata is, specific data-*
attributes can be used. For example, if we wanted to store the datetime
attribute of a <time>
element under the key date
, we would add data-pagefind-meta="date[datetime]"
to that <time>
element:
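    <time datetime="2024-01-26T21:29:31Z"
          data-pagefind-meta="date[datetime]">
      January 26th, 2024
    </time>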
The Pagefind documentation has detailed instructions on how these attributes are used.
\nPagefind comes with a pre-made search interface, which can be included in a page by loading one of the JavaScript files from the Pagefind binary's output. This interface is fine, but I wanted to build my own, partly because it'd be fun, and partly because I wanted the interface to better integrate with my website.
\nBuilding a custom search interface requires use of the Pagefind API. Fortunately, the API is fairly simple. A basic query can look like this:
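    // A sketch; the import path depends on where Pagefind's output
    // directory ends up being served from.
    const pagefind = await import("/pagefind/pagefind.js");
    await pagefind.init();

    const search = await pagefind.search("beep boop");
    console.log(`got ${search.results.length} results`);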
\n\nBoth the init()
call and the search()
call involve network requests. init()
causes Pagefind to load general metadata and configuration relating to the search database, as well as the relevant WebAssembly bytecode. Once init()
is done, Pagefind can figure out which chunks of index it should download for a given query, and these are the requests that search()
makes.
The object returned (within a promise) by the search()
call does not, however, contain all the metadata for each result. Each potential result—which is to say, each page—has its own file containing both the metadata which Pagefind gathered and the full searched text of the page. For each result, Pagefind can be asked to make a request for the corresponding file. The downside of this approach is the need to potentially make more requests to build a search results page, but the upside is that the client does not have to download metadata for pages that are not relevant. When using pagination, the client can also delay loading data for further pages, until it is needed.
Fetching the data for a single result looks like this:
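    // Continuing the sketch above: fetch the metadata and excerpt for
    // the first result. meta.date assumes the date[datetime] metadata
    // captured earlier.
    const result = search.results[0];
    const data = await result.data();
    console.log(data.url, data.meta.title, data.meta.date);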
\n\nPagefind returns all the metadata fields as strings, which means something like the date field might need to be parsed back into a Date
object.
When listing web log posts, my website uses a template fragment to render a post strip—a card which includes the publish date, the title, as well as a short summary of what the post is about. I use this on the front page, as well as on the page which lists all the published web log posts. It looks something like this:
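    <!-- A sketch; the real fragment's markup and variables differ. -->
    <article class="post-strip">
      <time datetime="{{ post.date | date: '%Y-%m-%d' }}">
        {{ post.date | date: '%B %d, %Y' }}
      </time>
      <h2><a href="{{ post.url }}">{{ post.data.title }}</a></h2>
      <p>{{ post.data.summary }}</p>
    </article>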
\n\nI wanted to reuse the post strip for search results that are blog posts. Using post strips there would both provide consistent design, and also supply some relevant information about what each result is—an excerpt highlighting relevant terms does not always provide that. Pagefind's metadata support provided an obvious means to supply the relevant fields needed for rendering the strip.
\nI was writing the search interface in vanilla JavaScript (to make it more fun), and ideally wanted to reuse the same partial template on the server side and the client side. This could be done with LiquidJS—which is one of the template engines that comes with Eleventy, and which is what I use for most of the site—but that would require the client to load the whole LiquidJS library (or at least a large portion of it), and that feels like overkill for a relatively simple partial template.
\nEleventy also supports EJS. Since EJS supports pre-compiling templates, a template can potentially be used both from Eleventy at build time, and can also be bundled into the client-side JavaScript code for use there. The problem with doing this in practice is that a template written for use with Eleventy will end up using a bunch of Eleventy-supplied names and functions, and so it will not be easy to use from places outside Eleventy—like the client-side search script.
\nPlain JavaScript is another template format supported by Eleventy, however. Thus, one solution for creating a reusable EJS template is to create a JavaScript wrapper which loads a generic, reusable template, takes a bunch of data from Eleventy, and renders the template by putting that data into a shape that the template can use. This is the solution I decided to go with, and since I was no longer using Eleventy's built in EJS support anyway, I decided to employ Eta instead (which is supposed to be like EJS, but better). I added an Eleventy filter for rendering Eta templates:
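    // In .eleventy.js, a sketch; the templates directory and the
    // filter name are illustrative.
    const { Eta } = require("eta");
    const eta = new Eta({ views: "./src/_eta" });

    module.exports = function (eleventyConfig) {
      eleventyConfig.addFilter("renderEta", (template, data) =>
        eta.render(template, data)
      );
    };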
\n\nA wrapper can then look something like this:
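    // src/_includes/post-strip.11ty.js, a sketch: reshape the
    // Eleventy-supplied data into the generic fields the Eta template
    // expects, then render via the universal filter added above (which
    // Eleventy exposes to JavaScript templates as this.renderEta).
    module.exports = class {
      render(data) {
        // renderFile merges the passed collection item into this
        // template's data, so its fields show up here directly.
        return this.renderEta("./post-strip", {
          url: data.url,
          title: data.data.title,
          date: data.date,
          summary: data.data.summary,
        });
      }
    };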
\n\nI can then use renderFile
to call the wrapper, from a place like the web log listing page:
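    {% comment %} A sketch; assumes the EleventyRenderPlugin, which
    provides renderFile, is enabled, and that the posts live in an
    illustrative "weblog" collection. {% endcomment %}
    {% for post in collections.weblog %}
      {% renderFile "./src/_includes/post-strip.11ty.js", post %}
    {% endfor %}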
For the client side, I use Rollup to pre-compile the Eta template, and then have a function which takes a single Pagefind search result object, and renders a post strip out of it:
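    // A sketch; assumes a Rollup plugin that compiles the Eta template
    // into a plain function at bundle time.
    import postStrip from "../eta/post-strip.eta";

    async function renderResult(result) {
      const data = await result.data();
      // The meta keys match the ones stored at index time.
      return postStrip({
        url: data.url,
        title: data.meta.title,
        date: data.meta.date,
        summary: data.meta.summary,
      });
    }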
\n\nFor a more detailed example I have a branch of my Eleventy example repository that contains enough code to roughly reproduce what my actual website does for Pagefind.
\n"},{"id":"https://dee.underscore.world/blog/ingesting-secrets-as-a-daemon/","url":"https://dee.underscore.world/blog/ingesting-secrets-as-a-daemon/","title":"Ingesting secrets as a daemon","date_published":"2023-09-15T15:35:01.000Z","date_modified":"2023-09-15T15:35:01.000Z","content_html":"Server software and other long-running daemons sometimes need to have access to local secrets. These can include things like private keys for a X.509 certificates, passwords for talking to an SMTP server, or access tokens for communicating with other services. In some setups, secrets can be obtained from a remote service over network, but other times there is a need for the secret to already be present on the machine.
\nAn administrator may wish to keep such secrets separate from other kinds of configuration the daemon needs. An example of such an approach is NixOS, where generated configuration files often end up in the world-readable Nix store, and so it is desirable for them to not contain secrets. It is also a useful practice in other deployments. For example, one may wish to make a configuration file public, and if such a file contains secrets, this can lead to their inadvertent disclosure.
\nThe ways that secrets can be kept separate from other configuration depend on the particular software's approach to handling configuration. Some implementations anticipate the need to separate secrets out, while others necessitate the use of workarounds.
\nThe simplest approach to handling secrets is to have them specified in a configuration file, just like every other configuration directive. For daemon authors, this makes things easier—there are libraries for loading settings from common serialization formats, like TOML or YAML, and using them means less need to write extra logic.
\nThe way NixOS modules deal with such situations is either string replacement, or processing structured data. In either case, a config file is first written out to the Nix store without the secrets in it. Before the daemon starts up, the config file is copied to ephemeral storage (such as tmpfs), and the secrets are read from elsewhere and inserted into it. With the string replacement approach, the input config file has placeholders in it, and these are searched for and replaced with actual secrets. When the file uses a common format like YAML or JSON, the other approach is to use tools like yq (the one written in Python), yq (a different one, written in Go), or jq to parse the config file, and insert the secret key–value pairs where they are needed. Once the secrets are in the config file copy, the daemon can be launched, and—if needed—given the path to the modified config file copy as a command line argument.
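\nThe structured data approach can be as simple as something like this (paths illustrative; note the caveat below):
    # Insert a secret into the generated config with jq; the result
    # goes to ephemeral storage. $storeConfigPath stands for the
    # secret-free config file in the Nix store.
    jq --arg token "$(cat /run/secrets/api-token)" \
      '.api.token = $token' \
      "$storeConfigPath" > /run/my-daemon/config.json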
\nOne thing to keep in mind when pre-processing config files in this manner is that command line arguments for running processes are usually readable by any user on the system. This means that any secrets provided on the command line—such as with sed
replacements—could be intercepted. With NixOS, workarounds for this problem include use of --rawfile
(with jq and the Python yq); use of the load_str
operator (with the Go yq); or, in place of sed
, using replace-secret
, a small utility written for the purpose. All of these approaches read secrets from separate files rather than passing them as command line arguments, enabling easier access control.
Some daemons—especially ones with bespoke configuration file formats—have the option of reading multiple configuration files. This can include config.d/
-type solutions, where the daemon reads every file inside some specific directory, or support for include
-type directives, where a configuration file can specify other configuration files to read.
In this situation, a separate config file can be created with just the sensitive directives. Such config files can, likewise, be placed in ephemeral storage, and then included in the rest of the configuration hierarchy, either though the use of symlinks, or through include
directives given the secret file's path. Such separation is useful even in setups not facilitated by tools like NixOS modules, since it decreases the chance of accidentally committing secrets to version control, or publishing them alongside the rest of the configuration.
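\nFor a daemon with a bespoke config format, the arrangement could then look something like this (the syntax here is illustrative):
    # Main config file, safe to publish.
    listen_port 8080
    # Pull in the sensitive directives from ephemeral storage.
    include /run/my-daemon-secrets/secrets.conf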
A common way for daemon software to support configuration—advocated by the Twelve-Factor App methodology—is via environment variables. Environment variables are handy for deployments with Docker and similar platforms, as feeding environment variables to the inside of a container is often easier than handling configuration files. Implementations usually merge configuration sources, constructing the effective configuration from the loaded config files, overlaid with the supplied environment variables. Environment variables thus provide a way to insert secrets into a separately specified configuration.
\nThere are multiple options for getting secrets into environment variables. If the secrets are in their own files, an expedient solution is a wrapper script that reads the secrets into environment variables before launching the main daemon. Sops (by itself, without sops-nix) has an exec-env
subcommand, for loading a Sops file into an environment, and then executing a process in that environment. systemd unit files have the Environment
and EnvironmentFile
directives. The latter directive is frequently used in NixOS modules, with the whole environment file treated as a secret.
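A wrapper script of that sort can be quite short (the paths and names here are illustrative):
    #!/bin/sh
    # Read a secret into the environment, then hand off to the daemon.
    SMTP_PASSWORD="$(cat /run/secrets/smtp-password)"
    export SMTP_PASSWORD
    exec /usr/bin/my-daemon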
There are some problems with sticking secrets in environment variables. The situation is not entirely dire: unlike with command line arguments, by default, a process's environment is not trivially readable to any user on the system. On the other hand, using environment variables for secrets makes the variables sensitive, and so they have to be handled more carefully—for example, debug logs dumping the entire environment could leak secrets. Child processes also inherit the parent's environment, even if they end up running as a different user, which means they could gain access to secrets they should not have access to. systemd subscribes to the idea that environment variables are a poor way to supply secrets to processes, and so it does not have native support for putting credentials obtained via LoadCredential
in environment variables. systemd's credential functionality is happy to provide the path to a file containing the secret in an environment variable, but not that secret's actual value.
Separate files, each containing a secret, are a common interface for providing secrets to software. This is how systemd credentials are exposed inside systemd services, and the way NixOS secret tools such as agenix or sops-nix work.
\nSometimes, this approach is supported natively by the daemon itself. A common case is PEM-encoded private keys, used for TLS (Transport Layer Security), which are generally awkward to handle inline in configuration files, since multiline strings are frustrating to input with many config file formats. Some software goes further, and allows any configuration option to be specified as the path to a file, where the contents of the file are what the option will be set to.
\nTo handle this approach, secret files have to be written out to some predictable location, and assigned permissions that allow the daemon to read them. Secret management tools generally provide a way to write a secret out to a file, or to standard output, which makes separate secret files a good baseline interface. systemd prefers this approach, and additionally provides a way to specify the path to a secret in an environment variable; it considers this safer than storing the actual secret in the variable, as the secret files are not world-readable.
\nWhen starting a greenfield project, especially one without any aspirations for high complexity, it is often easy to drop in an existing config-parsing library that uses a common file format, and call it a day. While using a common file format makes it easier to generate the config file with external tools (such as NixOS modules), the lack of specific support for secrets makes handling the config file potentially more difficult.
\nFortunately, there are often Twelve-Factor–themed libraries which make it easy to add support for overlaying of configuration via environment variables. This provides a simple option for specifying secrets in a more out-of-band manner.
\nReading each secret from an individual file is the preferred method for some systems, and so that is also a nice thing to support. This tends to be less readily available in config libraries, although it is not unheard of.
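\nA minimal version of such support, using the NAME_FILE environment variable convention popularized by official Docker images, might look like this sketch:
    import os

    def read_secret(name: str) -> str:
        # Prefer NAME_FILE (a path to a secret file) over NAME (an
        # inline value); the names are illustrative.
        path = os.environ.get(f"{name}_FILE")
        if path is not None:
            with open(path, encoding="utf-8") as f:
                return f.read().strip()
        return os.environ[name]

    smtp_password = read_secret("SMTP_PASSWORD")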
\nThere are also tool-specific approaches that can be supported, but these simpler, generic interfaces form a good baseline.
\nManaging secrets needed by daemons running under NixOS can be tricky. The Nix store is world-readable (any process on the machine can read it), and so it is not a good idea to write config files or scripts to it, if they include secrets. Fortunately, solutions like sops-nix and agenix exist. The idea behind both of those tools is to have the secrets stored in an encrypted format, with the key outside of the store. At system activation, the secrets are decrypted, and each is placed in its own file with a predictable path, inside a ramfs mount. Services that need the secrets can read them from that path.
\nThis is all very nice, but things get more complicated once we add NixOS containers into the mix. NixOS containers share the Nix store with the host system, but the rest of their directory tree is (by default) isolated, so they do not see the decrypted secrets under the host's /run
. Getting secrets from the host into the container by copying or mounting is non-trivial, and giving the container its own key means having to maintain a separate set of encrypted secrets, specifically for that container. An alternative exists: systemd credentials.
The systemd approach to handling secrets—which, in systemd land, are generally referred to as credentials—is similar to the approach that sops-nix and agenix take. Short, sensitive pieces of data are made available under a filesystem path, while only existing in ephemeral storage.
\nIn a systemd unit definition, the LoadCredential
directive can be used to—predictably—load a credential. Each credential will be placed in a file, and the path to the directory with these files will be passed to the unit's processes using the CREDENTIALS_DIRECTORY
environment variable. systemd will ensure that the credentials are only readable by the unit's user.
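In a unit file, that might look like this (with an illustrative path and name):
    [Service]
    # Expose /run/secrets/greeting to the unit as a credential named
    # "greeting"; the process can then read it from
    # $CREDENTIALS_DIRECTORY/greeting.
    LoadCredential=greeting:/run/secrets/greeting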
While systemd's credentials facility may seem a bit redundant with agenix and sops-nix, the added benefit here is the automated handling of permissions on the credential files. sops-nix and agenix can both set the owners of the secrets they provision, but this has to be kept in sync with the service's user, and becomes more of a problem if the service uses dynamic users. On the other hand, if sops-nix or agenix leave the secrets owned as root, systemd's LoadCredential
can still load them, and ensure the service's processes can read them.
An example of this pattern, assuming the configuration has sops-nix properly set up, could look something like this:
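    { config, pkgs, ... }:
    {
      # A sketch; the service itself is illustrative, but the names
      # match the rest of this post.
      sops.secrets.greetings_target = { };

      systemd.services.hello-sayer = {
        wantedBy = [ "multi-user.target" ];
        serviceConfig = {
          DynamicUser = true;
          LoadCredential =
            "greetings_target:${config.sops.secrets.greetings_target.path}";
          ExecStart = "${pkgs.writeShellScript "hello-sayer" ''
            echo "Hello, $(cat "$CREDENTIALS_DIRECTORY/greetings_target")!"
          ''}";
        };
      };
    }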
\n\nLet's assume we've set the greetings_target
secret to "Earth". We can check that it worked:
Loading credentials into systemd units has some uses, but how does it help with NixOS containers? Turns out, systemd's credentials concept extends not only to loading credentials into systemd services, but also to loading credentials into the system itself.
\nWhen running a virtual machine or a container, the host machine can pass any number of credentials to the init system inside the guest, at the time of the latter's startup. The guest init system can subsequently pass some of those credentials on to the services it launches.
\nsystemd supports a number of methods of passing credentials into the guest system. One is SMBIOS OEM strings, which QEMU can set, and which systemd will read at startup, looking for ones starting with io.systemd.credential
. Another method—more useful for containers—is similar to that used inside systemd services: give the PID 1 init process an environment variable called CREDENTIALS_DIRECTORY
, containing the path to a directory containing credential files.
systemd-nspawn
uses the latter method internally. It will set up a ramfs mount inside the container's namespace, write out the secrets to it, and set CREDENTIALS_DIRECTORY
to its path. It can be told to do this by using the --load-credential
option.
systemd-nspawn
is also what NixOS containers are based on. Although, as of writing, there is no NixOS option for passing credentials into NixOS containers, the freeform extraFlags
can be used to just pass the relevant --load-credential
argument to systemd-nspawn
:
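    # A sketch; the container name is illustrative.
    containers.greeter = {
      autoStart = true;
      extraFlags = [
        "--load-credential=greetings_target:${config.sops.secrets.greetings_target.path}"
      ];
      config = { pkgs, ... }: {
        systemd.services.hello-sayer = {
          wantedBy = [ "multi-user.target" ];
          serviceConfig = {
            DynamicUser = true;
            # With no path given, the credential is looked up among the
            # ones passed to the container's service manager.
            LoadCredential = "greetings_target";
            ExecStart = "${pkgs.writeShellScript "hello-sayer" ''
              echo "Hello, $(cat "$CREDENTIALS_DIRECTORY/greetings_target")!"
            ''}";
          };
        };
      };
    };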
Since systemd-nspawn
containers implement systemd's container interface, we can peek at the container's journals using the --machine
flag:
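    $ journalctl --machine greeter -u hello-sayer.service -o cat | tail -n 1
    Hello, Earth!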
Use of systemd credentials is not without its problems.
\nUnder both agenix and sops-nix, secrets have predictable paths, and those paths are available at NixOS configuration build time. For a systemd service, the path is provided dynamically. The path is passed in by systemd using an environment variable, and while in practice it is technically predictable, this is not documented or supported, so ideally, we do not want to guess the path ahead of time.
\nAccording to systemd, the ideal situation would be for the daemon itself to understand the CREDENTIALS_DIRECTORY
environment variable, and go looking for the relevant secrets in there. More likely, some daemons might support being passed paths to a particular secret via a specific environment variable, in which case those variables can be made to point at the relevant credentials, using the %d
specifier (as in the hello-sayer
examples).
For software where these solutions do not work, we have to resort to other tricks. Fortunately, the common NixOS pattern of replacing strings in config files before service start is easy to adapt for systemd credentials. With systemd 252 or later, the credentials directory is available in ExecStartPre
, which is where we usually end up doing credential replacement:
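    # A sketch; the daemon, placeholder, and paths are illustrative,
    # and configTemplate stands for the store path of the generated,
    # secret-free config file.
    systemd.services.my-daemon.serviceConfig = {
      LoadCredential = "api-token:/run/secrets/api-token";
      RuntimeDirectory = "my-daemon";
      ExecStartPre = "${pkgs.writeShellScript "my-daemon-pre" ''
        install -m 0600 ${configTemplate} /run/my-daemon/config.yaml
        ${pkgs.replace-secret}/bin/replace-secret \
          "@api-token@" \
          "$CREDENTIALS_DIRECTORY/api-token" \
          /run/my-daemon/config.yaml
      ''}";
    };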
Another way secrets are often handled in NixOS modules is through environment variables. systemd has no provisions for setting an environment variable to the content of a credential, since systemd does not want us doing that (which is a story for another day), but system-level credentials are under a predictable path: /run/credentials/@system
. We can commit systemd crimes by passing in an entire environment file as a credential, and then loading it in the unit:
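    # A sketch; assumes an environment file was passed to the container
    # as a system-level credential named my-daemon-env.
    systemd.services.my-daemon.serviceConfig = {
      EnvironmentFile = "/run/credentials/@system/my-daemon-env";
    };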
A number of modules in NixOS already do use LoadCredential
, on options that specify files with secrets. The advantage to doing it this way is that the permissions on the secrets paths do not need to be adjusted, as systemd will take care of presenting the credential to the service with the correct permissions. This is particularly advantageous when passing credentials into containers using --load-credential
, as those end up with root-only permissions. Where LoadCredential
is not used, we might have to add some extra scripts to copy system-level credentials into files readable by the target service, before service startup; this depends on how a particular NixOS module handles secrets, though.
As previously established, SVGs are nice. Being a vector format, they display well at varied sizes. Being a vector format, they also have to be rasterized by a renderer before display, which introduces an opportunity for rendering inconsistencies between different platforms.
\nOne possible source of such rendering inconsistencies is fonts. Just like with HTML, the basic use of a font requires that the font be available to the rendering program locally; if the font is not installed, the renderer may fall back to a different font. This is particularly undesirable if, for example, a piece of text is supposed to fit within a box: if the dimensions of the text change, it could possibly end up going over the box's borders. Therefore, if we want such SVGs to always look correct, we have to figure out how to work around this limitation.
\nFont glyphs (at least in a TrueType font) are defined as contours consisting of curves and lines. SVGs can also do paths that consist of curves and lines. It follows that text could be turned into SVG paths.
\nProperly rendering Unicode text can be difficult, so we could ask an existing font rendering library to typeset our text properly, give it to us in a vector format, and then turn that entire bit of text into paths. This is how several tools do it, including Inkscape, which can turn text into paths via the Object ▸ Object to Path command.
\nThe disadvantage of converting the entire text object into paths in this way is that each glyph rendered will be a separate path. Even when a character occurs within the text multiple times, each occurrence will be stored as a separate path in the SVG file. How much of a problem this is depends on how much text there is; with a lot of it, this can get annoying. In theory, it should be possible to list individual glyphs in <defs>
of an SVG image, and then <use>
them repeatedly, but the difficulty with that is the need to hook into a different point of the text rendering path.
There are other downsides to this approach: when rendered as a document, text in an SVG is selectable, but text-rendered-as-paths is not. Inkscape inserts the original text into an aria-label
, so at least accessibility tools can still read it.
Conversion of text to SVG paths is also lossy. On displays without a high pixel density (we have to assume not everyone is using fancy new high density displays, at least not all the time), font rendering can involve application of hints that the font file provides for aligning the glyphs to the pixel grid. The font rendering system can also do things like subpixel rendering. As rendering SVG paths usually does not involve such steps, text-as-text and text-as-paths can end up rendered differently:
\n\nNevertheless, this approach is expedient, and works fairly well for, say, text that is part of a logo-like design. But, perhaps we can find a different solution...
\nCSS-as-used-in-HTML supports the @font-face
at-rule, which can be used to tell the browser where to download a font. The font can then be used as if it was installed locally—the browser will download it when it needs to, and use it to render text in the document. A rather minimal example looks something like this:
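    /* A minimal sketch; the font name and URL are illustrative. */
    @font-face {
      font-family: "Fancy Font";
      src: url("https://fonts.example.com/fancy.woff2") format("woff2");
    }
    body {
      font-family: "Fancy Font", sans-serif;
    }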
@font-face
is also available with CSS-as-used-in-SVG, but there is an obstacle to using it when showing SVGs in the browser: when used as images, SVGs cannot download external resources.
There are several ways to use SVGs with HTML documents, like <img>
elements, <object>
elements, inlining, or use as CSS images. The uses fall into two broad categories: one where the SVG is treated like an image, and one where the SVG is treated like an XML document. When an SVG is included via say, an <object>
element, it is treated like an XML document, and so it can do stuff like download external resources or run embedded JavaScript. On the other hand, when an SVG is included via an <img>
element, or used for background via CSS, it is treated like an image, and restricted from accessing external resources, running scripts, or providing interactivity—it acts like a raster image would. As such, use of external resources in @font-face
limits how we can use our SVGs.
As an aside, these limitations mean that viewing an untrusted SVG included via an <img>
tag—such as on a website that allows user-uploaded media—will not result in immediate disaster. When it is viewed this way, scripts inside the SVG will not run at all. However, SVGs viewed in a tab by themselves do run as XML documents. With the user-uploaded media example, an attacker could thus link to the SVG file directly in order to get someone to run malicious scripts in the context of the domain where the SVG was uploaded. Furthermore, the browser may expose an Open Image in New Tab option in the right click menu for an SVG in <img>
, which provides a way for the user to wander into running the SVG's scripts.
The CSS url()
function can accept data URLs. As a data URL does not require accessing an external resource, we can use it to include a font in a way that will work even when the SVG is included via an <img>
element. As a bonus, we end up with an SVG that is self-contained.
In order to put our font file in a data:
URL, we have to encode it with base64, which introduces about 33% of file size overhead. Most modern browsers support WOFF2—a compressed font format designed specifically for transferring fonts to browsers—so we can offset the space requirement somewhat. It would look something like this:
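    /* As before, but with the font file embedded as a base64 data URL
       (the payload is truncated here for illustration). */
    @font-face {
      font-family: "Fancy Font";
      src: url("data:font/woff2;base64,d09GMgABAAAA...") format("woff2");
    }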
The problem with embedding a whole font in our SVG file is that we are embedding the whole font. The font file may include glyphs which are not going to be used in rendering the image at all. If the text we are including only ever makes use of a subset of all the available glyphs in the font, we could drop all the unused glyphs, to save on space. This technique is commonly used with PDFs (which can also embed fonts), but we can also bring it to SVGs.
\nThe fontTools package contains a subset
module, which can be used for producing subsets of fonts. Properly producing a font subset can be tricky, as we have to make sure we do not omit any of the data required to render the desired text (which can include more than just the obvious glyphs), but fortunately the subsetting tool's default settings work well here. fontTools can also output a compressed WOFF2 file, although this requires fontTools with the woff
extra. It works like this:
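    # Keep only the glyphs (and related data) needed for the given
    # text, and emit a compressed WOFF2 file; the font file names are
    # illustrative.
    pyftsubset FancyFont.ttf \
      --text="beep boop" \
      --flavor=woff2 \
      --output-file=FancyFont-subset.woff2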
Extracting the text from the SVG image can be the tricky part. It is easy enough to specify "beep boop" on the command line, if that's all our image contains. pyftsubset
can also read the text from a plain text file, but we have to actually extract the text from the SVG first. As a quick hack, we could just copy and paste the text into a file, and feed that to pyftsubset
. More properly, the solution would involve walking the XML tree, finding the text elements, copying their contents, and dumping that out. This process has been automated, with tools like svgoptim, which I have not, however, successfully tested (and the author of which has a blog post much like this one, but better). In the end, the result is something like this:
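    <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 60">
      <style>
        /* The subsetted WOFF2 from above, base64-encoded and truncated
           here for illustration. */
        @font-face {
          font-family: "Fancy Font";
          src: url("data:font/woff2;base64,d09GMgABAAAA...")
            format("woff2");
        }
        text { font-family: "Fancy Font"; }
      </style>
      <text x="10" y="35">beep boop</text>
    </svg>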
For short headline bits of text, or text incorporated into designs like logos, turning it into SVG paths is generally an expedient way to get decent results. For things like multiple longer diagram labels, or stretches of body text, embedding a subsetted WOFF2 file offers a portable SVG with minimized file size requirements, one that remains usable in <img>
elements. There is also the option of not using any of these techniques, if the design can accept the possibility of substitute fonts being used.
The Fediverse has been around for a while. Exactly how long depends on what one counts as the Fediverse, but ActivityPub—the protocol now used by perhaps the best-known Fediverse software, Mastodon—is over five years old.
\nThe Fediverse has historically mainly involved non-commercial entities. Indeed, for many of its users, its non-commercial nature is a major part of the appeal. Recently, though, corporations including the likes of Facebook have started expressing more interest in ActivityPub. Twitter's recently accelerated deterioration under Elon Musk is likely to blame for this, as the Fediverse (or rather, Mastodon) has been more widely covered as an alternative to the ailing social network. If Twitter does collapse—or turn entirely into just another Internet pit of fascism—then a vacuum will emerge. Moving into that vacuum is of interest to the other big corporate players, and the possibility that ActivityPub will be relevant for doing so explains the attention that it is now getting.
\nParts of the Fediverse have expressed concern about the possible entry of corporate interests into the space. These parts posit that the introduction of corporate, profit-driven entities is generally not desirable in spaces that are managed by the community, for the community. A concept that gets brought up in these discussions, and the thing that people fear will happen if corporate encroachment is not resisted, is embrace, extend, extinguish, or, with more of a penchant for the dramatic, embrace, extend, exterminate (EEE for short). But, what is it, and how likely is it to happen?
\nIn the ancient days of the 1990s, before the inception of Google or Facebook, tech's prototypical giant evil corporation was Microsoft. Like today, Windows was dominant on the desktop, and unlike today, the desktop was how most end-users did computers.
\nMicrosoft of that era engaged in many practices aimed at locking users into its products. An example is Microsoft's approach to Java. At the time, Java was a hyped new technology, as it promised a platform which could be used for distributing software (applets) over the Internet, while enabling developers to target the Java Virtual Machine, instead of the underlying operating system. Microsoft, indeed, embraced Java—there was even a Microsoft Java Virtual Machine.
\nMicrosoft also had concerns. Code written for the Java VM was not tied to Windows, and the Java VM could run on multiple operating systems. This meant that users were now less locked into Windows. To address this, Microsoft extended their Java VM with an API that allowed calling into the Windows API from Java. The standard Java platform contained abstractions over the operating system's API, which meant that code written for the standard Java platform was portable. The Microsoft extensions worked only with Windows, so anyone who wrote Java code that used the extensions was writing code that, despite being Java, would only work on Microsoft's operating systems. Sun Microsystems—creators of Java and holders of Java trademarks and copyrights—sued Microsoft over this, and eventually reached a settlement. Ultimately, Microsoft discontinued their own version of the Java VM.
\nIn the 1990s, the World Wide Web was an emergent technology that everyone—including Microsoft—wanted to get in on. Bill Gates, who at that point still took an active role in setting the direction of Microsoft, publicly declared the intention to "embrace and extend" the related standards. Microsoft's actions in the area, however, were a source of even more legal issues for the corporation, this time from American antitrust authorities. Towards the end of the 1990s and in the early 2000s, Microsoft was subject to protracted legal proceedings, brought by the United States Department of Justice.
\nIt was during one phase of this United States v. Microsoft Corp. trial that an Intel executive, Steven McGeady, was put on the witness stand. In his testimony, he recalled a 1995 meeting between Intel and Microsoft, on the two companies' involvement in the development of the Internet. According to McGeady, during that meeting, Paul Maritz—a Microsoft executive—said that it was Microsoft's strategy to "embrace, extend, extinguish" Internet standards.
\nSoon, embrace, extend, extinguish became a popular, pithy descriptor of Microsoft's strategy, and a way to criticize it. Microsoft, indeed, embraced web standards, extended them with proprietary additions, and attempted to extinguish any competition. Evidence submitted during the antitrust trial included, for example, late 1990s memos, in which Bill Gates discussed the fact that browser-based viewers for Microsoft Office documents worked in non-Microsoft web browsers. Gates believed that this hurt the position of Windows on the market. "We have to stop putting any effort into this and make sure that Office documents very well depends on PROPRIETARY IE capabilities", he urged.
\nIt took until the mid-2000s for the dominance of Internet Explorer to wane. New browsers—like Firefox—offered a better user experience in many areas, which meant that many users preferred them to IE. Internet Explorer also continued with its own idiosyncrasies when it came to adhering to web standards. Web developers could either produce websites which worked in every other browser while being broken in IE, or do extra work to also make them work in IE. The fatigue with this situation, combined with later entry of Google into the browser market with Chrome, meant that Microsoft was no longer able to deploy its EEE ways to the extent that it could before.
\nBefore that, EEE did work, at least for a while. In 2023, Microsoft Windows still remains the most popular operating system for desktop and laptop computers. Microsoft's early efforts at locking their position in the market are likely a factor here, even if modern Microsoft's EEE efforts do not have the same overtness and intensity as they did during the 1990s.
\nAnd, indeed, Microsoft of today likes to present itself as a company more interested in embracing open standards, without the old Microsoft's ulterior motives. It is still a corporation, and like all corporations it does not do things out of the kindness of its corporate heart. People have, however, grown wiser to the EEE ways over the intervening decades, and so blatant attempts at locking them in are more likely to be rejected, in favor of more open platforms. Lock-in requires more subtlety now.
\nIn 2004, Google announced that they would start offering an email service. The available storage for email would be 1 GB, which was an impressive amount at the time. The per-gigabyte cost of hard drive storage was then in the single digits of US dollars (in 2023, it is under 0.10 USD). Seeing gigabytes of stuff was not unfathomable to the people of 2004, but free email services tended to offer inboxes of maybe a dozen megabytes. On those services, it was expected that old email would be regularly deleted to make room for new incoming email. Gmail, on the other hand, promised you would not have to delete your emails again.
\nGmail also offered a spiffy Web client for reading and sending email. By 2004, webmail had been a thing for a while, but Gmail did webmail in a way that more resembled a modern web app of later years. JavaScript-based interfaces, which could load new data without reloading the whole page, were still a fairly new trend. Gmail's web app was a cool new thing, compared to other free webmail offerings. Google also offered other perks, like free POP access (which is an older standard for accessing email from a desktop email client), at a time when other free email providers used a strategy of limiting free features to get people to subscribe to an expanded, paid offering.
\nImmediately after it was first announced, Gmail became the hot new thing. During the initial rollout, when Gmail was invite-only, people were actually willing to pay money for an invite (which is the type of hype some startups still try to recreate in 2023). Throughout the 19 years since Gmail's start, its free offering remained ahead of those of other free providers. For a lot of people in need of a free email account, Gmail has thus established itself as the default option. It may be the most popular email service overall, although the exact numbers are hard to figure out, since we cannot see all the email use out there.
\nThe dominance of Gmail does come with some downsides. As email spam remains a persistent problem, Gmail offers spam filtering. This means that they reject some of the emails that are sent to their servers, often according to criteria that are opaque to the senders. When your own small email server gets consistently rejected by another small server, it is a small problem; when your server gets rejected by Gmail, it is a comparatively larger problem. Being rejected by Gmail's servers for unclear reasons, despite not originating any spam traffic, is something that indeed happens to people trying to host their own email. Being denied the ability to talk to any Gmail user means being denied the ability to talk to quite a lot of people. Gmail, thus, has an outsize influence over who can manage to effectively self-host email, and indirectly controls the market of other service providers.
\nThe Extensible Messaging and Presence Protocol (XMPP) is an instant messaging protocol which first emerged (under the name Jabber) in the late 1990s. Instant messaging (IM) software already existed at the time, even if modern smartphones did not. IM programs were generally used by someone at a desktop or laptop computer to chat, via text, in real time, with other people on their desktop or laptop computers. Presence is part of the protocol's name, because a user would have to know who else is also online at the same time, and thus available for chatting.
\nXMPP was an alternative to other, proprietary services. ICQ, Microsoft's MSN Messenger, and AOL's AIM were some of the proprietary IM services that enjoyed mainstream popularity through the 2000s. XMPP was, however, the decentralized, open source alternative to the proprietary IM services that put their traffic through a corporation's servers. With XMPP, anyone could host a server, and XMPP servers could talk to each other using a standard protocol, allowing users on one server to send instant messages to users on other XMPP servers. XMPP could also be used without federation, making it useful as a basis for things like internal communication tools. It is easier to use an existing protocol, with available software, than to invent something from scratch.
\nIndeed, Google chose XMPP as the backing protocol for their Google Talk software. Google Talk first appeared in mid-2005, and although it used XMPP, initially Google Talk users could only chat with other Google Talk users. Google turned open federation on in early 2006, enabling users on other XMPP servers and Talk users to communicate with each other.
\nGoogle Talk incorporated a bunch of Google-specific extensions to XMPP (which is, after all, extensible). Standard XMPP clients could be used with Google Talk, but the added features would not always work. Some extensions, though, were eventually standardized: Jingle, for example, started as a Talk extension used for establishing voice calls, but eventually had a standard specification published, and is now part of mainline XMPP clients.
\nAs is often the case with Google, Talk was eventually discontinued. In the early 2010s, Google decided to move away from Talk and to Google Hangouts (which was not based on XMPP). To that end, they stopped federating their XMPP servers, discontinued the various desktop and mobile apps, and removed Talk widgets from Google web apps. The servers themselves remained up, and reachable through third-party XMPP clients, all the way up until 2022. By 2022, Google was actually moving away from Hangouts, and to other messaging apps.
\nOne can, of course, still use XMPP today. It remains an open standard. The XMPP Standards Foundation still maintains extension standards (called XEPs). There is still maintained server and client software that runs on modern platforms. Like with email, the user counts are unclear, since there is no central authority with an overview of all XMPP users, but there are at least some users out there.
\nMicrosoft's strategy in the 1990s and early 2000s was clear: if something is open, add proprietary bits to ensure vendor lock-in, and thus maintain ongoing monopoly and market domination. Was Google's strategy the same? Did Google seek to exterminate email? Did they purposefully kill XMPP?
\nOne interpretation is that over the 2000s, corporations moved away from embrace, extend, extinguish strategies. After all, EEE did get Microsoft into frequent legal trouble. People became aware of EEE, and rather than getting locked into a piece of software, would seek out alternative solutions that adhere to standards. Market domination is still possible under these conditions, but requires different approaches. Google, for example, didn't need to lock people into their email offering with proprietary extensions, because email addresses are already a form of lock-in by themselves—having to tell everyone you've switched your email address is annoying. Google's size means it can offer more for free as a loss-leader, and achieve domination that way.
\nAnother interpretation is that corporations have gotten sneakier. You cannot pull a Microsoft anymore—all the coverage of Microsoft's EEE practices in the past means that people know what that looks like. Instead, you have to appear to embrace open standards without ulterior motives, while slipping in subtle incompatibilities. Large corporations with large user bases can dictate how a standard goes—if they move in one direction, everyone has to follow, or be left behind. When Google came out with Jingle, everyone had to get on board; Google, on the other hand, did not have to get on board with anyone else's extensions if it did not want to. Such dominance grants control, which can be used to extinguish a standard, without having to go for blatant lock-in.
\nDoes it ultimately matter, though? One could argue that Google killed XMPP by endorsing it at first and then pulling out of the space, taking all the users with it. One could also argue that XMPP was an obscure chat protocol for a bunch of nerds, designed for an older mode of communication, that saw several million users come in and then leave, putting it back where it started. One could argue whether Google intentionally killed XMPP in order to eliminate open competition, and push for its own IM platforms, or if Google simply boldly steered its bulk into the space, without caring who gets caught in the wake. But, does it matter? The results are what they are.
\nWhen rumors and reports of the possible involvement of Facebook (allegedly properly known as Meta) in the Fediverse began to circulate on the Fediverse, one of the concerns brought up was that the corporation's intentions were underhanded. The fear was that Facebook wanted to EEE the Fediverse.
\nIn the end, though, it does not really matter if a corporation like Meta enters the Fediverse with intentions that are actively malicious, or not.
\nOne might be tempted by the prospect that corporate involvement is going to lead to more work being done on the protocol and associated software, for the benefit of all. For those who care about there being more people on the Fediverse, welcoming a large corporation also sounds like a good way to bring in more users.
\nCorporations are, however, interested in mutually beneficial arrangements only because of the side of the arrangement that is beneficial for the corporation. When a corporation promises benefits to the community in exchange for the community letting it in, those benefits are only side effects that the corporation is willing to put forward as incentives. Profit is the primary goal of the corporation, and anything it does for the community is in pursuit of that.
\nImmediate, outright blocking of any attempt by Facebook to enter the Fediverse may seem like an excessive reaction; it is, however, an understandable one. Rejecting corporate entry into the space signals that, at least in that part of the Fediverse, corporate interests are not welcome, and that community is valued more than increased growth or mainstream acceptance. Principles of open federation that require unconditionally admitting all corporate actors may seem ideologically desirable, on some abstract level. On a slightly less abstract level, they potentially require admitting an actor that is not going to act in good faith, and will be harmful to the community—a thing that is, from an ideological perspective, undesirable.
\nPerhaps Meta would not be able to kill the Fediverse anyway. Large portions of the Fediverse, after all, were made by, and are populated by people who wanted to get away from corporate social media. The Fediverse is evidence that moving away from corporate social media is possible. Perhaps Facebook, when allowed to run free, would at most capture a portion of the Fediverse, find the lack of control limiting, find the opportunities for profit lacking, and leave by itself. Rejecting Meta outright from the start saves the Fediverse all that trouble, though.
\nWebsites about excluding Facebook/Meta from the Fediverse:
\n\nWhen doing diagrams of various sorts for blog posts, I like to use the SVG format. Vector images work well for diagrams, since diagrams involve simple shapes, lines, and text, and those are all things that vector graphics are good at. The SVG (Scalable Vector Graphics) format is—at least at the basic level—widely supported by modern browsers, making it a good choice.
\nInternally SVGs are XML documents, and, when viewed in a Web browser, there are ways in which they behave similarly to HTML documents. For one, SVGs support CSS—not in the same exact way as HTML, since SVG elements and HTML elements are not the same—but in ways that are often similar. One feature of CSS-as-used-in-HTML that CSS-as-used-in-SVG supports is prefers-color-scheme
media queries. This means that SVGs can adjust to dark and light schemes with CSS alone. This actually works—at least in Firefox and Chromium—even when the SVG is included in a document via an <img>
tag (which otherwise limits what SVGs can do). An example SVG image that makes use of this follows:
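```xml
<!-- A minimal example; the sizes and colors are arbitrary. -->
<svg xmlns="http://www.w3.org/2000/svg" width="200" height="60">
  <style>
    text { font-family: sans-serif; fill: #000; }
    @media (prefers-color-scheme: dark) {
      text { fill: #fff; }
    }
  </style>
  <text x="20" y="35">beep boop</text>
</svg>
```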
Diagrams tend to feature elements with consistent styling: boxes with the same kind of border, text labels with the same font, arrows with the same line thickness. In this situation—just like with HTML—it makes sense to give SVG elements semantic class names, or at least names that reflect reusable styles, and then write CSS rules based on those classes. A (non-diagram) example of this:
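```xml
<!-- A contrived example; the class names are arbitrary. -->
<svg xmlns="http://www.w3.org/2000/svg" width="240" height="80">
  <style>
    .greeting { font-family: sans-serif; font-size: 16px; }
    .accent   { fill: #a33; font-style: italic; }
  </style>
  <text class="greeting" x="20" y="30">hello there</text>
  <text class="greeting accent" x="20" y="60">general reader</text>
</svg>
```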
\n\nAuthoring SVGs entirely by hand is, however, a tricky proposition. There are tools like Mermaid which will, indeed, output SVGs that use classes and separate stylesheets for easier restyling. However, I like to author my SVGs with Inkscape, as it is a full vector image editor, and so it offers more flexibility than tools designed for authoring specific kinds of diagrams.
\nInkscape is a What You See Is What You Get editor. Because of this, it does not produce output where objects have consistent classes and where stylesheets are easy to apply. Indeed, it is not really designed with editing the XML source of an SVG as part of the most common workflow.
\nNevertheless, Inkscape actually does feature an XML source editor. The XML editor panel is available under the Edit ▸ XML Editor… menu entry.
\n\nInkscape also has a newer CSS editor, which can be accessed via Objects ▸ Selectors and CSS…. This editor somewhat resembles the element inspector you might find in a web browser, and actually allows assigning classes to elements, as well as adding CSS rules based on class selectors.
\n\nIt is also possible to edit an SVG simultaneously in Inkscape and an external text editor. Inkscape will do mostly fine with rendering externally modified files (provided they're valid SVG), and will copy such modifications through without rewriting them, unless a particular element is modified within Inkscape. It will not live-reload externally modified files—this is something the user has to remember to do themselves.
\nBy default, Inkscape will not assign classes to new elements. It will, on the other hand, enthusiastically assign styles on element level. When I am drawing a diagram, I generally do not care about setting these element-level styles, other than temporarily, in order to preview how I want things to look in the end. Copying and pasting styles between elements is what I am trying to avoid, though.
\nI like to use Inkscape's alternate display modes. Through View ▸ Display Mode, Inkscape can be set to display all elements as outlines, or to display elements as normal, but add extra outlines to them. This is useful for laying things out, even if their current styles make it less obvious where things are, and where they end.
\n\nOnce an element is mostly where it needs to be, I use the XML editor to give it a class or an id. Just like HTML elements, SVG elements can have any number of classes, as well as a unique id. Inkscape does automatically assign everything ids, in a pattern like path1234
; the id can be edited later from the XML editor, or the Object Properties dialog. Note that Inkscape also has its own labels, stored under the inkscape:label
attribute—this is the name that objects show up under in the layer editor, but it is distinct from the id
attribute.
When elements have some classes assigned, I open the SVG file in a text editor, and start adding a light-theme stylesheet under a <style>
element (directly under the <svg>
element). As with ordinary HTML, the default prefers-color-scheme
is assumed to be light
, so starting with a light theme by default, and then adding a dark one is an approach that makes sense.
At this point, I generally also open the SVG in a web browser. In theory, SVGs should render the same everywhere, but checking is still a good idea. An SVG viewed in a browser can also be easily refreshed, to show the most recent changes.
\nInkscape, on the other hand, can be forced to reload the SVG via File ▸ Revert. I try to save before moving out of Inkscape, and revert when moving back into it, so that I do not accidentally discard useful changes made in one program by saving over it in another (though text editors are better about handling changes made elsewhere).
\nOnce I have something in a dark-on-light scheme that looks okay, I start writing the light-on-dark version. I generally enclose the whole set of dark-themed styles behind a @media (prefers-color-scheme: dark)
query, though it is possible to intersperse such overrides at multiple points in the sheet. The web browser preview becomes useful at this point, as Inkscape will not read the styles behind the media query. As I normally use a light-on-dark desktop theme (and so my Firefox prefers a dark
color scheme), I get a preview of what is behind the media query. Firefox's Element Inspector has handy buttons for forcing prefers-color-scheme
to dark
or light
, which is useful for making sure that everything in the light theme still looks okay afterwards. The Element Inspector can actually also inspect SVG elements in the same way as HTML elements, including modifying their styles.
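Put together, a stylesheet following this pattern might look like the following sketch (the class names and colors are arbitrary):

```css
/* Default (light) styles come first… */
.box   { fill: none; stroke: #000; }
.label { fill: #000; font-family: sans-serif; }

/* …and the dark overrides live behind the media query. */
@media (prefers-color-scheme: dark) {
  .box   { stroke: #ddd; }
  .label { fill: #ddd; }
}
```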
Ordinary Inkscape SVGs leave a bunch of Inkscape-specific stuff in the file, useful for editing, but less useful for display. Inkscape also tends to leave unused things in the file—for example, gradients which were applied to an object that was later deleted. This is fine as far as rendering in browser goes, but ideally, the SVG served over the web should not contain a bunch of stuff that will never be used. To that end, Inkscape has the option of saving files as "Plain SVG" (as opposed to "Inkscape SVG"), as well as the option to clean up unused stuff (under File ▸ Clean Up Document). I like to process Inkscape SVGs with SVGO, a separate external tool for optimizing SVGs, which also gets rid of redundant or unnecessary things. Inkscape does have a built-in optimizing functionality—when saving as "Optimized SVG", Scour is applied to the file—so the use of SVGO here is a preference.
\nSVGs with media queries in them do not seem very common in the wild. There do exist other ways of adapting images to dark and light color schemes. A common, idiomatic one under HTML is to use <picture>
with <source>
s conditional on a media query:
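```html
<!-- The file names are made up; both SVGs contain the same drawing,
     each styled for one scheme. -->
<picture>
  <source srcset="diagram-dark.svg" media="(prefers-color-scheme: dark)" />
  <img src="diagram-light.svg" alt="A diagram" />
</picture>
```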
Although this approach requires two separate SVG files—which presumably duplicate geometry, while changing the style—a browser will normally not download any of the sources besides the selected one.
\nFancier solutions often involve using JavaScript to query the current prefers-color-scheme
, and then dynamically loading content based on that. The downside to this is that it requires the user agent to consent to running JavaScript, or fail to load the color scheme–specific content. If possible, it is generally better to use the above methods, which do not require running scripts.
When you run your own website—particularly one that includes a blog—you get the privilege of writing, on that website, about building the website. You even get to write about how much of a stereotype it is for a person who built their own personal website to post about how they've built their own personal website. Having now done the latter (if perfunctorily), I shall proceed to do the former.
\nI decided to switch the static website generator that I use from Hugo to Eleventy. I have been using Hugo for some time—I wrote a post about switching to it, back in February 2020—and it is an entirely fine static site generator. Hugo has been around for a while, seems to be fairly popular, is stable, performant, and generally does its job. However, certain things that I found tricky to do with Hugo, and some things I might want to do in the future, seemed like they would be easier with Eleventy, and so I decided to switch.
\nEleventy, also spelled 11ty, is a static site generator written in JavaScript. Eleventy describes itself as a "simpler static site generator". In its overall design it is, indeed, simpler than some of the fancier static site generators out there. The simplicity does not mean that its functionality is limited, though. Rather, Eleventy offers a simple base which can be extended, and upon which projects of various complexity can be built.
\nBy itself, Eleventy supports maybe eight, nine, or ten template languages (depending on how you are counting). These include Liquid, Nunjucks, WebC (Eleventy's own thing), and EJS. In addition, Eleventy makes it possible to add custom languages. While custom languages will not be as well-integrated with Eleventy as the ones it ships with, the custom language facility is still useful. For example, while Eleventy does not support Sass natively, adding support can be accomplished by pulling in the Sass library (from NPM), and telling Eleventy to process any .sass
or .scss
files with it. This takes less than a dozen lines of JavaScript.
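The sketch below follows the custom-template example from Eleventy's own documentation; error handling and the indented .sass syntax are left out:

```js
// .eleventy.js, based on the custom-template example in Eleventy's docs.
const sass = require("sass");

module.exports = function (eleventyConfig) {
  // Tell Eleventy to pick up .scss files as templates…
  eleventyConfig.addTemplateFormats("scss");

  // …and how to turn each of them into CSS.
  eleventyConfig.addExtension("scss", {
    outputFileExtension: "css",
    compile: function (inputContent) {
      // Compile up front; return a render function for Eleventy to call.
      const result = sass.compileString(inputContent);
      return () => result.css;
    },
  });
};
```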
New template engines are not the only thing that can be added in Eleventy. For example, by default, Eleventy can load data from JSON files, and the fields from those files can then be used in templates. However, support for other formats can also be added, by passing in a function that takes the input file and outputs a JavaScript object, so support can be added for formats like YAML, or even KDL if feeling fanciful. Another example is the Markdown library which Eleventy uses: the library can, by itself, use plugins, and Eleventy has a way of using an instance of the library with arbitrary plugins added. Scripting can be used in a lot of places—templates themselves can be JavaScript files, for example.
\nThe project also brags about its performance. While Hugo is faster than Eleventy, Eleventy's own benchmarks indicate that it tends to be faster than a bunch of other popular JavaScript static site generators.
\nI decided to port the website over to Eleventy without making any other large changes in the process. The idea was to get to a working state first, so that I can then make further changes, but also can publish new posts, without having to maintain a fork of the old version at the same time.
\nBecause I like Nix too much, I also wanted both to have a way of building the website as a Nix derivation, and to have a Nix shell capable of running a live preview server. With the previous site, I had direnv configured to load Hugo via a Nix flake, which meant that I could clone the repository, run direnv allow
in it, and be ready to go.
While Eleventy is not in Nixpkgs itself, it is easy to install with Yarn, and mkYarnModules
can be used to produce the required node_modules
without overrides. Eleventy itself can be pointed at that node_modules
, by using the NODE_PATH
environment variable, which works for both putting a working Eleventy in PATH
, and building the website as a derivation.
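A rough sketch of how those pieces can fit together (the names are invented, and many details are elided):

```nix
# A sketch; pname/version are arbitrary labels for the dependency set.
let
  nodeModules = pkgs.mkYarnModules {
    pname = "site-deps";
    version = "0.0.0";
    packageJSON = ./package.json;
    yarnLock = ./yarn.lock;
  };
in
pkgs.stdenv.mkDerivation {
  name = "site";
  src = ./.;
  nativeBuildInputs = [ pkgs.nodejs ];
  # Point Eleventy (and Node generally) at the pre-built dependencies.
  NODE_PATH = "${nodeModules}/node_modules";
  installPhase = ''
    ${nodeModules}/node_modules/.bin/eleventy --output=$out
  '';
}
```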
The actual blog content was not particularly difficult to port. Both Hugo and Eleventy support Markdown, and that is what my articles are all written in. I also use custom shortcodes, but those were easy enough to rewrite for Eleventy. Hugo uses Go templates, while Eleventy, by default, pre-processes Markdown as Liquid templates. This means that the syntax for invoking shortcodes is slightly different, but in practice they are similar enough that a simple search-and-replace is largely sufficient.
\nThe non-blog portions of my website involved some more free-form rewriting. Eleventy is less rigid than Hugo when it comes to the directory structure of templates and layouts. This is handy for the less frequently edited portions of a website, as they can be kept simpler. Eleventy is also fairly easy to debug—there is plenty of opportunity to console.log()
the state (pretty-printed) at various points, even inside templates.
The downside of the simplicity is that some things are harder to accomplish. For example, under Hugo, a page can be a directory with an index.md
, and other, arbitrary files. Hugo will copy these other files as attachments, respecting the path settings in index.md
's front matter. Eleventy, on the other hand, does not really have the concept of a page and its adjacent files in the same way. Fortunately, the extensibility of Eleventy means that such functionality can be bolted on, and there, indeed, already are plugins (see Eleventy issue #1540 if interested).
Comparing static site generators is tricky, because the category of static site generator encompasses a wide variety of tools. Some generators make it really easy to get going with a blog, and automatically handle things like categories, tags, or web feeds; other generators come with less blog-specific functionality, but are more suited for a wide variety of different kinds of websites. Some generators produce traditional, static HTML sites, while others are meant for web apps.
\nPrior to Hugo, I used Lektor. Lektor is less rigid than Hugo. Instead of making assumptions about how input data should be structured, it requires explicit declaration of the structure. I left Lektor for Hugo, and Hugo's advantage was the fact that it comes with all the batteries included, and it mostly did everything I needed it to out of the box, where Lektor required plugins or patching. Then, I switched to Eleventy, which handles input data in a less structured way than Lektor, while also relying on extensibility rather than inclusion of batteries.
\nEleventy works well for me, for what I want to achieve, and so I considered the switch worth it. That does not make it strictly the best static site generator, and whether or not it will work for anyone else depends on their individual use case and needs. Even things like speed benchmarks are relative, since speed can matter less for smaller websites (such as my own). On the other hand, Eleventy is easy to play with, and relatively well-documented, so I can recommend looking into it, if it seems like it would fit your requirements.
\nI published a repository, which serves as an example of how an Eleventy website can be built as a Nix package, while also providing a direnv environment for working on it.
\nSince release 22.05 "Quokka", NixOS can be installed via a graphical installer, which makes for an installation experience closer to that of a traditional Linux distribution. NixOS is, however, in many ways not like a traditional Linux distribution.
\nThe manual way of installing NixOS—the main one available before the introduction of the Calamares-based graphical installer, and still available as an option now—generally goes like this: nixos-generate-config
creates some basic configuration files, which the user can then adjust to their needs. nixos-install
then builds the first system configuration based on those files, and sets up the machine so that it will boot into NixOS with that first system configuration.
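In outline, that flow looks something like this:

```sh
# From a NixOS live image, with the target filesystems mounted under /mnt:
nixos-generate-config --root /mnt
# …adjust /mnt/etc/nixos/configuration.nix to taste, then:
nixos-install
```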
Like Nixpkgs packages, NixOS system configurations are deterministic and reproducible. Indeed, on the low level, they are the same thing: paths that end up in the Nix store. When nixos-install
builds a new NixOS system configuration, it is no different from when, on an existing NixOS system, nixos-rebuild switch
builds a new system configuration to switch to. These commands do some extra work, however: nixos-install
gets the system bootable, and nixos-rebuild
performs the generation switch. As a consequence of this, if we understand all the extra stuff that nixos-install
does, we can install NixOS in some unconventional ways.
To understand how we can make a NixOS install bootable, it is useful to understand how NixOS actually boots.
\n\nThe very early stages of booting an x86 machine to NixOS work very much like most other Linux distros. After powering on, the firmware will first get us into the bootloader. How it finds the bootloader depends on whether we're booting in the classic PC BIOS way or the UEFI-GPT way. With BIOS, the firmware looks for the bootloader in the first sectors of the configured storage devices. With UEFI, the firmware consults its own non-volatile memory, which lists available paths that can be booted, or otherwise falls back to a spec-mandated default path.
\nWith NixOS, the bootloader that we get into will generally be either GRUB or systemd-boot. In either case, the bootloader has one or more boot entries, representing different NixOS system configurations (sometimes called generations). Each entry will point to a kernel, and an initial ramdisk (initrd) image, both of which are files within the boot partition—that is, not in the Nix store. The initrd contains the Stage 1 init script, as well as a number of executable binaries and kernel modules.
\nThe Stage 1 script's job is mostly to mount whatever filesystems are required for the system to boot. Because this can involve things like LVM or LUKS, the initrd (hopefully) contains all the modules that are required for handling the storage setup present. Also because of this, Stage 1 is when the user is prompted for any LUKS passphrases, if they are required. At its end, the Stage 1 script runs the Stage 2 script (via exec
).
The Stage 2 script is in the freshly mounted Nix store, under the path for the target system configuration (this directory is symlinked from /run/booted-system
on a running NixOS system). The bootloader entry from earlier also contained the store path to the Stage 2 script, and this is how the Stage 1 script knew what to run. The Stage 2 script contains the activation scripts.
The activation script portion of Stage 2 is built from the system.activationScripts
configuration option. Its job is to activate the system configuration. A NixOS system configuration contains in it things like files that should go in/etc
, configuration of user accounts, or sops-nix/agenix secrets. The system configuration, however, lives inside the Nix store, and we want our /etc
files symlinked from, or copied to the top-level /etc
directory, our users reflected in /etc/passwd
, and our secrets provisioned somewhere under /run
. The activation scripts make these things happen.
After the activation scripts are done, the Stage 2 script finally runs systemd (also via exec
). systemd is now running as PID 1, and takes over the rest of the booting. The rest of the boot happens as with any other systemd distro—units get started in the correct order, until systemd reaches the desired target.
If we consider the boot process, it turns out that actually installing NixOS is not really that involved. For example, we mostly do not need to provision anything in the root filesystem (/
) ahead of time, as this gets populated at boot time (a fact which some people use to run NixOS with tmpfs mounted on /
).
The path that we do need is the Nix store under /nix
(which can be a different filesystem than /
). Inside the store we also need to have a system configuration. Since copying a path with its entire closure is part of the core Nix functionality, copying the top level path for a system configuration into the store also copies all the other paths the configuration needs, so this task is relatively simple.
Outside of the store, we need to install the bootloader, and give it what it will need for booting Stage 1. Installing a bootloader is easy enough. Both GRUB and systemd-boot provide scripts for installing them, and these are wrapped in a distro-provided script. Provisioning the initial ramdisk is more complicated. The initrd is derived from the system configuration, but needs to be outside of it, inside the boot partition. To boot the system configuration, its entry needs to be added to the bootloader's config files, and its initrd needs to be placed somewhere where the bootloader can reach it.
\nAnother observation we can confirm here is that installing a new NixOS system, and switching an existing NixOS install to a new generation are actually mostly the same thing. The one thing that may differ is installing the bootloader: when setting up a new generation for an existing NixOS install, we can assume the bootloader is already there; when installing NixOS anew, we need to actually install the bootloader first.
\nLet's actually install NixOS, then. I am going to be installing to a host called sillyvm, which is a virtual machine (though it does not have to be). I am also going to use a second machine, called buildbox. buildbox is where the system configuration will be built, and while it is the VM's host, it could also be elsewhere.
\nLet us start by booting a live USB image on sillyvm. The reasonable thing to do here would be to boot a NixOS live image—either one of the generic ones that Hydra builds (these are the ones available on the NixOS website), or perhaps one we built ourselves, after customizing it with things like our public SSH key. So, let's instead use the Fedora Workstation 37 live image. It is relatively up to date, and has a bunch of tools useful for installing Linux.
\nAfter booting the live image and setting up sshd (which is already installed, and can be started by starting the sshd.service
unit via systemctl
), we can set up the storage. This works the same as with the NixOS live image, and so the instructions in the manual work, while GPT fdisk and fdisk from util-linux are also available. We'll want to boot this install via UEFI, so we'll set up a generous, 512 MiB EFI system partition, followed by a single Linux partition for NixOS, spanning the remaining storage volume. The EFI partition has to be formatted with FAT32, and we are going to make the NixOS root partition Btrfs. Just like with NixOS, we then mount the root partition under /mnt
, and the EFI partition under /mnt/boot.
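With the scriptable sgdisk from the GPT fdisk package, and assuming the virtual disk shows up as /dev/vda, the whole dance can look roughly like this:

```sh
sgdisk --new=1:0:+512M --typecode=1:ef00 /dev/vda  # EFI system partition
sgdisk --new=2:0:0 /dev/vda                        # NixOS root
mkfs.fat -F 32 /dev/vda1
mkfs.btrfs /dev/vda2
mount /dev/vda2 /mnt
mkdir -p /mnt/boot
mount /dev/vda1 /mnt/boot
```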
With a normal NixOS install, at this stage we might want to run nixos-generate-config
. The conventional approach spits out a template configuration file and a generated hardware-configuration.nix
to /etc/nixos
. The template file is handy for getting started, but not strictly necessary, and we are going to be writing the configuration files on buildbox anyway. What would be handy is the hardware configuration file, since this is hardware-specific (hardware here includes the partitions present). For this use case, nixos-generate-config
has the --show-hardware-config
switch, which outputs the hardware configuration to standard output.
nixos-generate-config
is in the nixos-install-tools
package… but we do not have Nix on the booted Fedora system, so we can't easily nix shell
into that package. We also can't copy it into the system from buildbox, since Nix is not available on sillyvm.
Okay, but if we nix copy
to a bare local path (as opposed to a file:
path), Nix will create a whole new store. What if we just copied that over?
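Something like this, with an invented path for the throwaway store:

```sh
# On buildbox: nix copy to a plain local path creates a store rooted there…
nix copy --to /tmp/sillystore nixpkgs#nixos-install-tools
# …which can then be shipped wholesale to the Nix-less live system.
rsync -a /tmp/sillystore/nix/ root@sillyvm:/nix/
```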
Turns out that this works, for the most part.
\nComing back to buildbox with the hardware-configuration.nix
file we generated on sillyvm, we can now write the rest of the configuration. Channels represent mutable state, so if we opt to use flakes instead, we can make things a bit easier for ourselves.
There are many ways people organize their flakeified NixOS configurations (there is a list of configuration repositories on the NixOS wiki that features various examples), but we'll start with something straightforward. The flake.nix
can look something like this:
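```nix
# A sketch; the nixpkgs branch is one choice of many.
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-22.11";

  outputs = { self, nixpkgs }: {
    nixosConfigurations.sillyvm = nixpkgs.lib.nixosSystem {
      # The platform is set via nixpkgs.hostPlatform in the hardware
      # configuration, so it does not need to be repeated here.
      modules = [
        ./configuration.nix
        ./hardware-configuration.nix
      ];
    };
  };
}
```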
We can then write a minimal configuration:
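```nix
# A sketch; the user name, key, and password hash are placeholders.
{ config, pkgs, ... }:

{
  networking.hostName = "sillyvm";

  services.openssh.enable = true;

  users.users.dee = {
    isNormalUser = true;
    extraGroups = [ "wheel" ]; # wheel gets sudo by default
    # Generated with mkpasswd; note that it ends up world-readable
    # in the Nix store.
    hashedPassword = "$6$…";
    openssh.authorizedKeys.keys = [ "ssh-ed25519 AAAA… dee@buildbox" ];
  };

  system.stateVersion = "22.11";
}
```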
\n\nBy default, nixos-install
prompts for a password for the root user on the newly installed system. If we skip that step, we can end up booting into a system where we have no way to grab root at all. To work around this, we add our user to the group wheel
, which, by default, has sudo
permissions. We also provide a password hash (made with mkpasswd
) and an ssh key that can be used to log into the account. Putting the verbatim password hash here is not the most secure thing (the hash ends up world-readable in the Nix store), and if we were using something like agenix or sops-nix, we could use that to provision the password instead.
Let's look at the generated hardware configuration file we pulled from sillyvm:
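```nix
# Abridged; the UUIDs are elided, and the exact module list depends on
# the hardware that nixos-generate-config detected.
{ config, lib, pkgs, modulesPath, ... }:

{
  imports = [ (modulesPath + "/profiles/qemu-guest.nix") ];

  boot.initrd.availableKernelModules = [ "ahci" "xhci_pci" "virtio_pci" "virtio_blk" ];
  boot.initrd.kernelModules = [ ];
  boot.kernelModules = [ ];
  boot.extraModulePackages = [ ];

  fileSystems."/" = {
    device = "/dev/disk/by-uuid/…";
    fsType = "btrfs";
  };

  fileSystems."/boot" = {
    device = "/dev/disk/by-uuid/…";
    fsType = "vfat";
  };

  swapDevices = [ ];

  networking.useDHCP = lib.mkDefault true;

  nixpkgs.hostPlatform = lib.mkDefault "x86_64-linux";
}
```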
\n\nWhile the file starts with an admonition to not edit it, that does not really apply to us. Re-running nixos-generate-config
on a live NixOS system can overwrite /etc/nixos/hardware-configuration.nix
, but we are on a different machine entirely right now, and will not be using or even touching /etc/nixos
at all.
There is also a weird thing that happened with nixpkgs.hostPlatform
. This part is populated by nixos-generate-config
executing nix-instantiate
. As Nix is not properly setup in our live environment (we literally just rsynced a store to it), nix-instantiate
spits out an error message to standard error, which nixos-generate-config
fails to ignore. Arguably, this is a bug, though it only comes up in exotic circumstances, such as our current ones.
Anyway, let's make some changes:
\n\nWe have added some mount options (fun fact: Btrfs mounts with discard=async
under Linux 6.2 by default), but the filesystem settings look fine otherwise. More complicated setups—for example, ones involving LUKS—might need further tweaking here.
More importantly, we've configured the bootloader. boot.loader.systemd-boot.enable
tells NixOS that we want the bootloader to be systemd-boot. systemd-boot only works on UEFI systems, but it's a less complicated option than GRUB. Of importance is also boot.loader.efi.canTouchEfiVariables
, which does what it sounds like—when set to true
, NixOS is allowed to add an entry for the bootloader to the machine's efivars.
Okay, now we can build the system configuration:
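```sh
# On buildbox, inside the flake's directory:
nix build .#nixosConfigurations.sillyvm.config.system.build.toplevel
```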
\n\nThis may look like a bit of arcane incantation, but we can break it down part by part.
\n- .# tells Nix which flake we want—the flake in the current directory.
- nixosConfigurations.sillyvm is the output we declared in flake.nix.
- config is the same config we use inside configuration files, which is to say it is the final effective configuration with all the options defined; in the same way, we can, for example, do nix eval .#nixosConfigurations.sillyvm.config.networking.hostName to get the final, effective value for networking.hostName.
- system.build.toplevel is an option inside the config. The definition of an option can depend on the definitions of other options, and system.build.toplevel is simply a derivation that happens to depend on the definitions of essentially all the other system configuration options. The derivation's output is a system configuration, and it is what we are building.
\nnix build will have given us a ./result with the system configuration, but it's on the wrong machine. Fortunately, we can just repeat our silly rsync trick again:
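```sh
# On buildbox: copy the configuration's closure into a throwaway store…
nix copy --to /tmp/sillystore ./result
# …and push it into the store of the filesystem mounted on sillyvm.
rsync -a /tmp/sillystore/nix/ root@sillyvm:/mnt/nix/
```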
This covers the part about having the system configuration in the Nix store on the target system, for the most part. Now we have to install the bootloader and populate it with the initial ramdisk, and the corresponding bootloader entry. The infrastructure for doing both of those things is actually already in our freshly built system configuration.
\nWe want to invoke the system configuration's script for installing the bootloader… but that script is in the Nix store under /mnt/nix
. If the script refers to an absolute path under /nix/store
, that will point to our live environment's store, not the target system's store. This means we need to chroot ourselves to /mnt
before invoking the install script.
Fortunately, the nixos-install-tools
package we rsync'd to the live environment earlier also comes with the nixos-enter
script, which is essentially chroot
into a NixOS system, along with some handy setup steps, like setting the locale paths or bind-mounting the host system's /dev
and /sys
(handy if we want to modify those EFI variables). nixos-enter
will refuse to enter a system that is not NixOS, but a NixOS system is simply one where /etc/NIXOS
is present, so we can get our nascent install recognized as NixOS easily.
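One way to go about it (store paths abridged; the real ones can be found by poking around /mnt/nix/store):

```sh
mkdir -p /mnt/etc
touch /mnt/etc/NIXOS  # convince nixos-enter that this is a NixOS system
/nix/store/…-nixos-install-tools-…/bin/nixos-enter --root /mnt \
  --system /nix/store/…-nixos-system-sillyvm-…
```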
We had to go digging in our new system configuration for Bash, because otherwise nixos-enter
would have trouble finding it, but there we are… in the NixOS install of sillyvm, kind of.
Now we need to do two things: set our new system configuration as the current system generation, and then run the activation script to install and configure the bootloader.
\nUnder NixOS, the system generations are tracked via the /nix/var/nix/profiles/system
profile. This profile does not exist in our store at this point, but we can simply use nix-env
to create it:
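```sh
# Inside the chroot; the path is the same system configuration as before.
nix-env --profile /nix/var/nix/profiles/system \
  --set /nix/store/…-nixos-system-sillyvm-…
```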
Okay, now for the activation script. The NIXOS_INSTALL_BOOTLOADER
environment variable can be set to make the activation script install the bootloader (predictably enough). nixos-enter
actually set /run/current-system
to point to our system configuration, so we don't have to keep pasting the long path, and can just do this:
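```sh
# Still inside the chroot:
NIXOS_INSTALL_BOOTLOADER=1 /run/current-system/bin/switch-to-configuration boot
```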
Cool. Now we can Ctrl+D out of the chroot, and reboot the live environment (or shut it down and switch the boot device in the VM configuration). Then, if everything went well, we should be able to boot into the NixOS system.
\n\nIs this a practical way to install NixOS? No, not really.
\nIf we wanted to use a downloaded live image, then one of the NixOS live images would have been the best choice for installing NixOS. If that is not available—such as when using a cloud server from a provider who does not offer the ability to boot arbitrary live images, and provides a selection that does not include NixOS—another option would be booting another Linux live image, and installing Nix into it. A number of distributions already have Nix in their repositories, and if that is not available, there are always the Nix install scripts.
\nHaving actual, properly installed Nix in the live environment makes some things easier: instead of rsyncing a Nix store to /mnt
, we could do nix copy --to ssh-ng://sillyvm?remote-store=/mnt
. nixos-install
has a --system
option, which can be pointed at the freshly copied system configuration, meaning we can still build it elsewhere first. The installation process would be similar internally, but the actual tools account for some edge cases, and so are far less likely to break in weird ways, compared to our unconventional methods.
Nevertheless, understanding how the install process works allows us to pull off some unconventional install methods that are actually practical.
\nAs the release of Hogwarts Legacy approached, certain details about the game's plot became public. This has led people to share those details with those who may not necessarily wish to know them ahead of playing the game; that is to say, it led people to post spoilers.
\nInterestingly enough, this isn't the first time that spoiling a Harry Potter thing has become an Internet meme. Another time was over 17 years earlier, in 2005, at the point when the book Harry Potter and the Half-Blood Prince was first released.
\nWhile these two points in time mirror each other, the context around them is quite different, and it is those differences that highlight the arc that the world took over that span of time. Seventeen years ago, the earliest elements of the modern Internet were just getting their start, and today we can see where they eventually arrived.
\nThe sixth Harry Potter book, Harry Potter and the Half-Blood Prince, was scheduled to release on 16th of July, 2005. By the time the series got to number six, it was already widely popular, and its author—J.K. Rowling—had already made ridiculous amounts of money off it. There were movies, video games, and assorted other stuff that comes with a popular franchise. The book release was highly anticipated. A store in Canada accidentally sold a couple of copies to some fans prior to the official release, and the buyers were subsequently prohibited by court order from even reading their copies. It was one of those releases that cause lines around the block.
\nThe details of the plot of the book became generally known at about release time. There was enough Internet in 2005 for that knowledge to spread around to those who sought it. There was also enough Internet to link unsuspecting fans of Harry Potter to said spoilers, in the manner of a shock site (something that was already a thing by the mid-2000s).
\nIn 2005, Facebook was still restricted to university students, and Twitter did not exist yet. Social media, in fact, was not even a term in common use. 2005 did have ways of shooting digital video, and ways of publishing that video on the Internet—YouTube launched in February of 2005, although earlier and jankier ways were available prior to that. This is how we can still watch a blurry, low-resolution, and low-framerate video of someone rolling in a car past a bunch of Harry Potter fans and yelling out "Snape kills Dumbledore".
\nWhat was also around in 2005 was 4chan—it has been around since 2003, and so predates Facebook, Twitter, and YouTube. 4chan did take notice of the Half-Blood Prince spoilers. Two years later, Harry Potter and the Deathly Hallows leaked despite the publisher's likewise extensive efforts to prevent that. 4chan users took it upon themselves to spoil the book to fans queuing up in physical spaces, in imitation of the driver from 2005. This is why if, today, you look for videos of Harry Potter being spoiled, you are more likely to find videos related to The Deathly Hallows, including some featuring white British lads in suits and afro wigs yelling spoilers over megaphones, a weird artifact of contemporary Internet culture.
\nHogwarts Legacy is the first major video game set in the Harry Potter universe to release in years. As one of those high-budget video games long in development, broad in scope, and high in marketing spending, it has been eagerly anticipated by those fans of the series that still remain fans.
\nBy 2023, J.K. Rowling has revealed herself to be an odious transphobe, and became a prominent voice in the so-called debate on whether transgender people should have rights—on the no side. While Rowling had no involvement in writing for the game, she did create the franchise, and so receives royalty payments from the various properties under it. Trans people, and others with decent opinions on trans people, have pointed out that purchasing the game gives money to a notorious bigot, and have asked people to refrain from doing so.
\nOther controversies included the fact that Troy Leavitt, the game's lead designer, used to run a YouTube channel with anti-feminist and anti–social justice content. The game's content also attracted some criticism: the plot focuses on goblins which, within the Harry Potter universe, are hook-nosed greedy bankers—an antisemitic trope in the real world.
\nAll of this has led some of the aforementioned people with decent opinions to spoil the game for unsuspecting fans who have not yet played through it. A common copypasta for this is thus:
\n\n\nYour teacher, Professor Eleazar Fig, dies at the end of Hogwarts Legacy. This happens in all possible endings and can't be changed. Oh and Rookwood is the one who cursed Anne while the goblins were framed
\n
Of course, people these days generally don't line up at a store to purchase video games on physical media, so the spoiling has chiefly been taking place online. On the other hand, scrolling through social media timelines is a far more common practice today, so those present a ready place for the spoilers.
\nThe two points in time offer an interesting look into the arc that the Internet took over a span of over 17 years. In 2005, pocket computers were an esoteric gadget, and the infrastructure to comfortably operate them everywhere was not yet there. Social media was nascent, and spending a lot of time talking to people over the Internet was the domain of weird nerds, rather than a normal part of everyday life for a large portion of the population.
\nWhat existed was memes, and what existed was spaces on the Internet inhabited by the aforementioned weird nerds. 4chan is, perhaps, the most well-known of them, as it remained notable and notorious over the following decade and beyond. Looking at the artifacts of that era's Internet, we can glimpse a particular culture, which is also exemplified by the 2005 and 2007 spoiling memes.
\nThe 2000s shenanigans were, notably, not particularly political. Fans of the Harry Potter franchise were not categorized as any particular political camp at the time. Rather, the efforts to ruin their day were motivated by a desire to cause mayhem for the sake of mayhem. People yelling spoilers while wearing vaguely racist costumes claimed to do so for the lulz, because bullying people is amusing.
\nIf you do not find bullying people amusing, you are, of course, excluded from the in-group. If you think that the conduct of the group is perhaps a bit too libertine, then you are a killjoy who deserves to be the group's victim. This dynamic has existed long before the Internet; it probably also existed in Internet spaces prior to the mid-2000s. It was the mid-2000s when there were the first glimpses of how that dynamic would play into the increasingly central place the Internet and social media played in people's lives.
\nThe 2023 conflict over Hogwarts Legacy is, by contrast, quite political. The opposition to the opposition, coming from the right wing, is the familiar kind of contrarianism that seeks to dismiss and oppose any sort of concern that the more progressive elements bring up—even if that concern is that trans people should not be subjected to genocide.
\nOne could say the 2023 spoilers are not coming from the opposite side of the political spectrum, relative to the mid-2000s spoilers, because the mid-2000s spoilers were not coming from a political camp at all. However, it feels like the mid-2000s spoilers were coming from the modern right, because the modern right is essentially what the milieu that brought us the 2000s spoilers eventually grew into.
\nSince, over the span of 2005 to 2023, the Internet has shifted from being a separate thing to being an integral part of the real world, we should examine the broader context of how things changed.
\nThe antisemitic caricatures used for the goblins in the Harry Potter universe were Rowling's invention, long before they found their way to the latest video game. With the rise of Rowling as a prominent public bigot, people have started pointing out that her Harry Potter books have had bigoted elements in them from the start. Whether it is the lazy use of heavy-handed racial stereotypes for minority characters, or seemingly more malicious inclusion of bigoted tropes (like the goblins), it leaves a sour taste for many readers in 2023.
\nIt would be a mistake to assume that the views we, in 2023, consider progressive or left-wing emerged out of whole cloth between 2005 and now. In a lot of cases, what progress entailed was a broadening of the portion of the population that holds these views to be true. The opinion that it is fine to treat queer people with disdain is less common in 2023 than it was in 2005, but even in 2005 and before, we had people who believed that it was wrong.
\nThe Internet has a lot to do with this state of things. The Internet has provided a vector for exposing people to, and convincing people of views that may otherwise be considered radical. Radicalization does not only work in one direction, however. The culture which spawned Deathly Hallows spoiling events is what eventually evolved into one of the components of the modern Internet political right. One need look no further than the fact that the 2023 4chan is commonly considered one of the hives of that broad group.
\nThere are two ways this can be looked at.
\nOn one hand, we started out with some weird nerds who, freed from some of the constraints and limitations that life otherwise imposed on them, found spaces on the Internet where they could indulge in whatever they wished, with no regard for the consequences for themselves or others. As the Internet increasingly merged with the so-called real world, these consequences also increasingly became real-world themselves, which led to more pronounced opposition. The erstwhile weird nerds then became reactionaries, joining the broader reactionary currents. Trolling people for the lulz became owning the libs.
\nOn the other hand, we started with Internet spaces inhabited by weird nerds, with misogyny, racism, and other forms of bigotry that alienate people who are not the right kind of specific weird nerd. The broadening reach of the Internet, combined with general social progress has, however, made more inclusive spaces available to a broader range of weird nerds, and other people. As even people who are capable of staying in toxic spaces often find staying in non-toxic spaces preferable, the average baseline level of toxicity has decreased. Those who do actually prefer their spaces toxic were forced to consolidate, and so became a more coherent reactionary force, but overall the Internet is a better place to be in. If you believe that you should not be subject to genocide, then the Internet can let you reach plenty of people who believe likewise.
\nBoth of these perspectives are fundamentally true, of course. It is common to lament the world which our interconnectedness has brought us. Perhaps it is important to also acknowledge the ways in which our world is now better, if only because it shows that things can get better, which means that there is a point to it all. Knowing what is wrong, and how things got to be wrong is important, but so is believing that we can win.
\nSo, you know, fuck the wizard game.
\n"},{"id":"https://dee.underscore.world/blog/export-subst-reproducibility/","url":"https://dee.underscore.world/blog/export-subst-reproducibility/","title":"Git export substitutions and reproducibility","date_published":"2022-06-10T19:34:42.000Z","date_modified":"2022-06-10T19:34:42.000Z","content_html":"There is an issue that I have encountered some months ago (and mentioned on the Fediverse back then), but I am still occasionally reminded of it when troubleshooting weird behavior with Nix, so it might be interesting to take a closer look at it. In short: when using Git's export substitutions, a tarball exported by Git from the repository at a particular revision may not always be the exact same, depending on external factors.
\nTo understand how this quirk affects Nix, we have to understand both what Nix's fixed-output derivations are, and how some rather obscure Git features interact with Git internals.
\nIf you have had passing contact with Nix and Nixpkgs, you may have heard people refer to packages from Nixpkgs as derivations. A derivation is, broadly, a thing that describes a build action that Nix can take.
\nDerivations take as inputs whatever is needed to carry out the build action, contain a script that specifies how to do the build, and register whatever that script produced as outputs. Suppose a simple C program: the inputs to its derivation would include things like the source code of the program, a C compiler, Make, perhaps some library that the program needs; the build script would call make
and make install
; and the output produced would be the executable binary that make install
copied out.
Inputs to our derivation are outputs of other derivations. Those derivations can, of course, have their own inputs, which are other derivations, and so on—this is how we build our dependency graph.
\n\nThis is also where Nix's functional nature comes in: a derivation whose input derivations are unchanged, and whose build script is unchanged is assumed to always produce the same output—like a pure function. This is what allows for a Nix binary cache: instead of building the derivation ourselves, we can download the output of the same derivation built on some other machine, because that output is assumed to be the same.
\nIn our previous example, one of the inputs to the simple C program derivation was its source code. This presents a problem: source code is not really the output of a build process that takes other inputs. Instead, source code is an input from the outside world.
\nTo address this, Nix has a special type of derivation: the fixed-output derivation (often abbreviated as FOD). Like ordinary derivations, fixed-output ones have inputs and a build script, and produce an output, but in addition they also contain the expected cryptographic hash value of the produced output.
\nAfter Nix carries out the build action specified by a fixed-output derivation's build script, it hashes the produced output, and checks it against the pre-recorded expected hash. If the hashes do not match, the build is considered to have failed. Unlike with ordinary derivations, a fixed-output derivation is considered unchanged if its expected hash remains unchanged—its script and inputs can change, as long as it produces the same exact thing as before.
\nWhile builds carried out from ordinary derivations do not have network access, fixed-output derivation builds do. In practice, fixed-output derivations are thus often used to fetch source code from the Internet. Because we have to specify the hash of the source code to be fetched, we can be reasonably certain that we are fetching the same source code every time we rebuild the derivation.
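\nThe simplest fixed-output derivations look something like this (a sketch; the URL and hash are placeholders):

```nix
{ lib, fetchurl }:

# fetchurl produces a fixed-output derivation: the build is allowed network
# access, but its result must match the pre-recorded hash.
fetchurl {
  url = "https://example.org/program-1.0.tar.gz";
  sha256 = lib.fakeSha256; # replace with the real hash that Nix reports
}
```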
\nGit has a handy command for exporting the tree at a given commit to a single archive file (tar or ZIP): git archive. This is useful for situations such as distributing a source code release: simply git archive the repository at the relevant tag, and publish the resulting tarball.
Not having the Git repository around can pose some problems, though. For example, some build processes expect to be able to discover the current Git commit hash, because they embed it in the version information of the binary they produce. To address this, Git provides a means of doing export substitutions. Using .gitattributes, one can specify a list of files that should have placeholder substitution performed on them when the repository is exported to an archive. These placeholders can stand for things like the current commit hash (including its abbreviated version), the output of git describe, or metadata like the commit author and date.
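\nAs an illustration, a repository could mark a file for substitution like this (the file name here is hypothetical):

```
# .gitattributes: substitute placeholders in version.h upon `git archive`
version.h export-subst
```

version.h could then contain a placeholder such as $Format:%h$ (the abbreviated commit hash) or $Format:%H$ (the full one), which git archive replaces on export.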
Git uses abbreviated commit hashes in many places in its UI. While a full Git commit hash is 40 hexadecimal digits long (as SHA-1 hashes generally are), commits can usually be unambiguously referred to with some smaller number of digits.
\nThe bigger a repository—or, more precisely, the more objects a repository contains—the larger the probability that two object hashes will share a prefix of some length. A long time ago (before 2016), Git defaulted to 7 digits. As time went on and some repositories grew in size, it turned out that 7 digits, or even 8, 9, and more, were not enough to unambiguously refer to objects in those repositories. Because of this, a heuristic was added to Git to estimate how long an abbreviated hash needs to be, based on an estimate of the repository's object count. This is not an exact determination of the minimum unambiguous length of a hash, but rather a relatively fast guess. It is used by default in various outputs of Git, though a fixed length can be set via configuration.
\nfetchFromGitHub
Github repository pages offer an option of downloading the current tree as an archive file.
\n\nThis functionality internally uses git archive, which means that the downloaded ZIP has the relevant placeholders substituted with actual values. The archive downloads are also available under predictable URLs, which is convenient for scripting.
Nixpkgs contains a fetchFromGitHub function for—predictably enough—creating fixed-output derivations that fetch source code from Github. As an optimization, when possible, fetchFromGitHub will opt for downloading an archive tarball (Github provides both tar.gz and zip archives). This is desirable, because we usually do not need a full Git repository clone, and the tarball is a compressed (thus smaller) archive that can be quickly and easily downloaded over HTTPS.
Consider a repository that uses Git export substitutions to place the abbreviated hash somewhere in the exported archive file. If we use this exported file in a fixed-output derivation, it will work as long as the length of the abbreviated hash does not change. If it does change, the cryptographic hash of the archive will be different, and our fixed-output derivation's build will start failing.
\nThe length Git picks for the abbreviated hash can change over time—if people keep making new commits in the repository, the number of objects Git keeps track of will increase, and the heuristic will pick more digits for the abbreviated hash. Under this assumption, our fetchFromGitHub derivation can become invalid at some point after we write it, even if we are still fetching the same revision.
In practice, however, this is more complicated. Practical tests show that repeatedly downloading a repository archive from the same URL, within a short span of time, can result in getting versions that include both shorter and longer substituted hashes, seemingly at random.
\nWe can speculate on why that is. Consider that the hash length estimation is based on the number of all objects in the Git repository, which does not necessarily include only the objects within the current commit tree (which is to say, objects associated with the current and previous commits). Such a Git repository can include other things, such as orphaned commits which have not been garbage collected yet, or branches for things like pending pull requests. Github could, conceivably, have several servers which hold copies of a given repository, each containing the entire main branch, but with differing subsets of other objects. If our request can potentially go to any of those servers for load-balancing purposes, then we could end up with different tarballs based on how many objects the given server holds. Another possible reason is some form of caching, where one cache server holds a tarball generated at a time when a shorter hash sufficed, while another has to regenerate the tarball from the current repository state. The details are opaque to us, since we are (presumably) not Github, but we can certainly come up with plausible scenarios.
\nThe obvious solution is to not use abbreviated hashes in export substitutions. If the commit hash is to be embedded somewhere in the built artifacts, using the full hash ensures a far smaller chance of a collision in the future either way (if the full SHA-1 hash collides, then we are really in trouble).
\nOutside of that, fetchFromGitHub can be forced to download via Git, rather than via an exported archive. This can complicate the build process, but a Git checkout should be entirely reproducible (Git commits are, after all, referenced by cryptographic hashes themselves).
Generally, when building from a tagged release, embedding Git revision hashes may not be necessary. The tag will exist in the repository and point to the given revision, and in general should not be moved anyway. Packages in Nixpkgs are most commonly tagged releases, rather than arbitrary commits from the trunk branch, and those commonly embed version numbers, rather than the precise commit hash.
\nFurther reading: the export-subst feature in Git's documentation, and a Nix-related issue about export-subst.
\nHTTP compression has been part of the standard since the late 1990s. The idea behind it is simple: to save bandwidth, compress the responses that a server sends. The two most common compression formats understood by browsers today are gzip and DEFLATE. Both of them employ the same compression algorithm, but vary in how they do checksumming and headers. Both of them have also been in the standard since the late 1990s.
\nBrotli is a lossless compression algorithm developed at Google, starting in the year 2013. Initially intended for use in the Web Open Font Format, its specification was published as RFC 7932 in 2016, and at about the same time it started being available in mainstream Web browsers as a generic HTTP compression format.
\nAs Brotli tends to achieve better compression ratios than gzip, and is widely supported by browsers in use today, it may be a good idea for today's Web servers to support it, in addition to the classic gzip and DEFLATE encodings. NixOS's Nginx can do Brotli, but it requires enabling an extra module, as Brotli support is not built into the server normally.
\nNginx from Nixpkgs can be rebuilt with the Brotli module by using override.
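\nA sketch of this override (nginxModules.brotli being the Brotli module packaged in Nixpkgs):

```nix
# In NixOS, the result can then be set as services.nginx.package:
nginx-with-brotli = pkgs.nginxMainline.override {
  # Keep the default module list, and append the Brotli module to it:
  modules = pkgs.nginxMainline.modules ++ [ pkgs.nginxModules.brotli ];
};
```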
nginxMainline is Nginx from the mainline branch. Mainline is the Nginx branch which sees more active development, while the stable branch is closer to what would elsewhere be called a Long Term Support version—it sees mostly bug fixes, and no new feature development. In both cases there are tagged releases, and both branches are considered suitable for use in production. nginx in Nixpkgs points to nginxStable, but it is fairly easy to switch between the two.
When overriding modules, we have to remember to include the original modules—nginxMainline.modules—and append brotli to them. Unlike with NixOS configs, there is no merging of lists, so if we were to set modules to just the brotli module, we would get rid of the defaults.
Chances are, this modified Nginx will not be in cache.nixos.org, and will need to be built locally. An exception is nginxStable with Brotli; this tends to be in the cache, because a test incidentally builds it. The Discourse module in Nixpkgs enables Brotli with Nginx, and so when Discourse is tested, Nginx with Brotli has to be built, and subsequently ends up in the cache. Fortunately, compiling Nginx is not a very resource-intensive operation either way.
In NixOS configurations, an alternative to manually overriding modules is available: services.nginx.additionalModules. This setting internally does what we previously did: add our chosen modules to the default list of modules. Adding the Brotli module will, however, not add the relevant Brotli settings. The module's README contains an example configuration, so we can crib from that instead.
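\nA sketch of what this can look like in a NixOS configuration (the compression level and MIME type list are illustrative):

```nix
services.nginx = {
  enable = true;
  additionalModules = [ pkgs.nginxModules.brotli ];
  # Cribbed from the ngx_brotli README:
  commonHttpConfig = ''
    brotli on;
    brotli_static on;
    brotli_comp_level 5;
    brotli_types text/plain text/css application/javascript application/json
                 application/xml image/svg+xml;
  '';
};
```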
\n\nThis sets up both serving of statically compressed assets with brotli_static on (more on this later), and dynamic compression of responses with brotli on. The compression level can be tweaked to trade speed and CPU usage for density. We generally do not want dynamic compression to be too intense, because the server spending more time to achieve higher compression means more time spent waiting by the clients. There is also a reasonable list of MIME types to compress. We do not want to compress things like already-compressed images, but on the other hand, mainstays of the Web like HTML and CSS compress pretty well with Brotli. This configuration enables Brotli for the whole server, but we could also enable it at the vhost or location level—just add brotli on; to the extraConfig there instead.
Dynamically compressing responses generated by a dynamic service is useful, but sometimes we also serve stuff that is entirely static. Since, in that case, we have the static files ahead of time, we can compress them ahead of time as well. As a bonus, in this situation we can also use higher compression levels—when building our website ahead of time, we do not have the same time constraints as an active Web server.
\nWhen brotli_static is set to on and Nginx is asked for a file, it will look for a file with the same name as the requested one, but with .br suffixed, and serve that compressed file directly in response (if the browser can accept Brotli compression). This means that in order to serve precompressed files, we simply need to put a bunch of files with .br extensions next to our original files.
Modern fancy webapp build systems often have something that can be inserted into the build pipeline to generate such precompressed files. When dealing with packages in Nixpkgs, however, we might want to avoid patching the build scripts of each package to get it to emit .br files, and instead add them to the finished thing with a more generic solution. To this end, we can write a derivation that uses a simple Bash script and the brotli binary.
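\nA sketch of such a brotlify.nix (the argument names are assumptions, chosen to match the discussion below):

```nix
{ lib, stdenvNoCC, brotli }:

{ src, extensions ? [ "html" "css" "js" ] }:

let
  # Becomes: \( -iname '*.html' -o -iname '*.css' -o -iname '*.js' \)
  findQuery = "\\( "
    + lib.concatMapStringsSep " -o " (ext: "-iname '*.${ext}'") extensions
    + " \\)";
in
stdenvNoCC.mkDerivation {
  name = "brotlified";
  inherit src;

  nativeBuildInputs = [ brotli ];

  buildPhase = ''
    # Compress matching files in parallel, keeping the originals around:
    find . -type f ${findQuery} -print0 \
      | xargs -0 -n 1 -P "$NIX_BUILD_CORES" brotli --best --keep
  '';

  installPhase = ''
    cp -r . "$out"
  '';
}
```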
Let's follow this from the inside out, starting with buildPhase. First, we call find to locate files of the types that we want to compress, and pass their list to xargs, so that we can do parallel execution with -P (which cannot be done with find -exec).
The variable $NIX_BUILD_CORES is supplied by Nix, and is meant to be the number of concurrent jobs that should be executed during a build. It corresponds to the nix.conf setting cores (nix.buildCores in NixOS config), is meant to be the number you would pass to make -j, and can vary from build machine to build machine (hopefully your builds don't break reproducibility when parallelized). Nix actually has a second setting—max-jobs (nix.maxJobs in NixOS config)—which controls the number of build jobs (builds of individual derivations) the particular machine will run at the same time. Each build job can potentially run cores processes, so you could have max-jobs × cores processes running at the same time.
Our find call is supplied with a list of arguments from findQuery, which tells it which files to locate. What we are doing here is turning a list of extensions (like, say, [ "html" "css" "js" ]) into a series of -iname (case-insensitive name match) arguments separated by -o ("or"), which would turn our example into -iname '*.html' -o -iname '*.css' -o -iname '*.js'. We enclose this all in parentheses (escaped, so they don't get eaten by bash), because otherwise find would interpret the query as (in pseudocode) (of type file and extension "html") or extension "css" or extension "js", instead of of type file and (extension "html" or extension "css" or extension "js").
Our derivation is two nested functions. This is so that we can get the inputs of the outer function supplied by callPackage, giving us the inner function that we can call directly, using it something like this:
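\nFor instance (a sketch, with element-web standing in for any webapp package):

```nix
let
  # callPackage supplies the outer function's inputs (lib, stdenvNoCC, brotli):
  brotlify = pkgs.callPackage ./brotlify.nix { };
in
# The inner function can then be called directly:
brotlify { src = pkgs.element-web; }
```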
In our previous brotlify.nix, we simply added some .br files to an existing webapp, and copied the whole thing to a new output. This is convenient, but sub-optimal when it comes to composability.
Consider, for example, that we may wish to provide .gz
files in addition to .br
files. Nginx can serve precompressed .gz
files, using a module which is built into the Nginxes from Nixpkgs by default. While gzip compression is usually less resource intensive than Brotli compression, there is the Zopfli project (Google apparently likes to name their compression projects after bread), which tends to produce .gz
files with better compression ratios than those achieved by GNU's gzip, at the cost of longer compression times.
We could create something like brotlify.nix that calls zopfli on all the files instead of brotli, and then chain them, like brotlify { src = (zopflify { src = element-web; }); }. We can reason that the functions commute—regardless of whether we apply the Brotlification or the Zopflification first, we still get the same result, as each function simply adds more files, leaves existing files unchanged, and will not try to compress the other's outputs. The problem is that Nix cannot reach that conclusion, and so brotlify { src = (zopflify { src = element-web; }); } and zopflify { src = (brotlify { src = element-web; }); } are not the same derivation. Add a couple more filter functions like this, and the number of possible permutations becomes a problem.
A different solution is to make brotlify and zopflify output just their corresponding compressed files, and then overlay their outputs on top of the original webapp. In addition to nicer composability, we also gain the ability to apply the functions concurrently, whereas previously we would have had to wait for the first function to produce an output before applying the second. Let's modify brotlify.nix for use with this pattern.
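\nThe modified installPhase might look like this (a sketch; the rest of the derivation stays as before):

```nix
  installPhase = ''
    # Copy only the .br files, recreating the directory structure under $out:
    find . -type f -name '*.br' \
      -exec install -D {} "$out/{}" \;
  '';
```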
The only change is essentially that we now only copy *.br files to $out, preserving the same directory structure (which is where install -D comes in handy). We can conceptualize the corresponding zopflify version as being essentially the same, except calling zopfli instead of brotli and exporting *.gz files instead of *.br files.
To compose all of this together, we use the buildEnv function from Nixpkgs. buildEnv combines several directory trees together using symlinks, which is handy for all sorts of things, like "installing" several end-user packages into one environment. It can also be used to layer our compressed overlays over the base package. With the default configuration, Nginx will follow symlinks, so a directory tree full of symlinks is fine to use as root.
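\nA sketch of the composition, reusing the element-web example (brotlify and zopflify being the callPackage'd functions from before):

```nix
let
  element-compressed = pkgs.buildEnv {
    name = "element-web-compressed";
    # The original package plus the two compressed overlays:
    paths = [
      pkgs.element-web
      (brotlify { src = pkgs.element-web; })
      (zopflify { src = pkgs.element-web; })
    ];
  };
in
{
  services.nginx.virtualHosts."element.example.net".root = element-compressed;
}
```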
The nixos-rebuild script, when asked to switch to this configuration, will build both the Brotlified and Zopflified versions, possibly concurrently.
We can verify that the server is indeed serving precompressed responses by issuing a request with curl—curl --raw https://element.example.net/olm_legacy.js -H 'accept-encoding: br'—and possibly comparing the result against the precompressed version on disk. If you need to find the store path of the composed tree on a running system, you can use systemctl cat nginx to find the location of the Nginx config file, and then, inside that config file, look for the root directive emitted under the relevant vhost.
The above derivations could use some further improvements. For one, they indiscriminately compress everything, including very small files. Compressing very small files can result in compressed files that are actually larger than the original, which is why both the Brotli module and the built-in gzip module by default do not apply compression to small responses; the default the threshold is 20 bytes for both. Still, it would be prudent to examine how much each file was actually compressed, and prune those which do not show sufficient improvement over uncompressed sizes.
\nAnother thing to note is that combining compression and encryption opens the possibility of compression oracle attacks. These attacks rely on the fact that plain text with repeating strings will compress to a shorter length than text where those strings do not repeat. Consider a webpage that displays both a secret token and some user input. If both the user input and the token are the same, the page will compress to a shorter length than if they are different. An attacker who can repeatedly try different user inputs can, therefore, check the length of the response to see if their guess was correct, even if they cannot actually decrypt the response. This is the basic principle behind the BREACH attack.
\nNevertheless, compression of static, non-secret assets is generally safe.
\nIn January 2024, I rewrote my precompression scripts—based on slightly expanded versions of those listed in this article—into a more reusable package, largely based on Nushell. It is called nix-compressify, and is in a hopefully more easily reusable shape.
\nIf you just want to install some fonts through Home Manager, you can enable the fonts.fontconfig.enable option, and then add some font packages to home.packages.
This article explores doing some weirder things with font packages and Home Manager.
\nRecently, I was reading some source code on my computer, and suddenly it occurred to me to stop reading the source code and start messing with my computer's font setup instead.
\nMy preferred font for reading source code is Fantasque Sans Mono. In version 1.8.0, Fantasque Sans Mono received a number of programming ligatures, which is a problem for me, because I do not like programming ligatures. Some people find them useful in making code easier to read, but I am not one of those people.
\n\nPreviously, I dealt with this problem by simply using an older version of Fantasque. I would drop it in my ~/.local/share/fonts, where it would be picked up by Fontconfig. This is, of course, not very reproducible—if I wanted to use these fonts on another computer, I would have to (probably manually) copy the files into that computer's ~/.local/share/fonts. Avoiding these sorts of things is one of the reasons I use Home Manager—I can keep a set of configuration files in one central Git repository, and not have to worry about keeping track of a bunch of files, while remembering the where, what, and how of them.
Fantasque Sans Mono can be found in Nixpkgs (under fantasque-sans-mono), and thanks to the diligence of Nixpkgs contributors and maintainers, it is at the latest version: 1.8.0. This means that I can easily install the font by simply adding it to home.packages in my Home Manager configuration, but if I do that, the font will come with ligatures, which I do not want. I thus set out in search of more complex solutions, and perhaps some yaks to shave along the way.
Within an OpenType font, glyph substitutions are organized into categories called features. The idea is that software doing typesetting can decide which features are desired, and only perform glyph substitutions listed under those features.
\nFantasque lists its programming ligatures under the calt feature, also known by its friendly name, Contextual Alternates. If software can be convinced to ignore the Contextual Alternates feature, it should not apply the ligature substitutions when displaying text using the font. This is, in fact, how the example image was created in Inkscape: SVG supports styling text with CSS, and CSS supports font-feature-settings to control font features, which means we can do font-feature-settings: "calt" off to tell the rendering engine to skip the Contextual Alternates substitutions.
Fontconfig can be used to set the default font features for a given font, and the Arch Wiki provides a helpful example of how to do this. In order to apply this solution via Home Manager, we need to instruct it to deploy the configuration file to where Fontconfig will expect it (i.e. ~/.config/fontconfig/conf.d/
).
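\nA sketch of doing this through Home Manager, adapting the Arch Wiki example (the target file name is arbitrary):

```nix
xdg.configFile."fontconfig/conf.d/99-fantasque-no-calt.conf".text = ''
  <?xml version="1.0"?>
  <!DOCTYPE fontconfig SYSTEM "fonts.dtd">
  <fontconfig>
    <!-- Disable Contextual Alternates (the ligatures) for Fantasque. -->
    <match target="font">
      <test name="family" compare="contains">
        <string>Fantasque Sans Mono</string>
      </test>
      <edit name="fontfeatures" mode="append">
        <string>calt off</string>
      </edit>
    </match>
  </fontconfig>
'';
```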
Problem solved, right? Not quite. As the Arch Wiki article points out, not all software respects this setting. Software using Pango (so GTK apps) in general will follow the setting and disable the ligatures, but other software—such as Firefox, or programs using Qt—will fail to do so.
\nA way to ensure our software will not render ligatures is to give it a font file without the ligatures in it. One option is to use an older version of Fantasque, as previously; another is to strip the ligatures out of the latest version. Fortunately, I am not the only one who prefers fonts without programming ligatures, and so someone has already provided the latest version, but without ligatures (I would like to note my appreciation for the "sans ligatures" name here).
\nThe Sans Ligatures version of the font should be relatively easy to package, since it is very similar to the upstream Fantasque. This means we can take inspiration from fantasque-sans-mono as it is packaged in Nixpkgs. Unfortunately, since the derivation uses fetchzip, we cannot simply use fantasque-sans-mono.overrideAttrs to change its url, sha256, and name. What we can do is take more than just inspiration, grab the entire file, and modify it for our use. If you do something like this, and also publish your NixOS or Home Manager configurations on the Internet, keep in mind the Nixpkgs license (MIT).
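\nA sketch of the modified expression (the URL is a placeholder):

```nix
{ lib, fetchzip }:

fetchzip {
  name = "fantasque-sans-ligatures-1.8.0";
  # Placeholder URL, standing in for wherever the Sans Ligatures release lives:
  url = "https://example.org/fantasque-sans-ligatures-1.8.0.zip";
  # Instead of preserving the extracted tree, pick out the .otf files and
  # put them where font packages are expected to place them:
  postFetch = ''
    mkdir -p $out/share/fonts/opentype
    unzip -j $downloadedFile \*.otf -d $out/share/fonts/opentype
  '';
  sha256 = lib.fakeHash; # build once; Nix will report the real hash
}
```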
Normally, when using fetchzip from Nixpkgs, its postFetch extracts the ZIP we asked it to download, preserving the tree. This is useful for getting the source tree of something that we are building, but it is not what we want in our case. What we do instead is extract all the .otf files from the archive and put them under share/fonts/opentype, which is the standard location for OpenType font files. We also need to populate the sha256 hash of the output; an easy way to figure that out is to set sha256 = lib.fakeHash;, and try to build the derivation, which will cause Nix to error out and report the hash it actually got.
What we can do now is simply callPackage the file we wrote (well, the file we modified).
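\nAssuming the file above sits next to our Home Manager configuration:

```nix
home.packages = [
  (pkgs.callPackage ./fantasque-sans-ligatures.nix { })
];
```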
I use exa as my ls replacement. exa has a feature which allows it to display icons next to listed files. For this feature to work, exa has to be run inside a terminal emulator using a font that supports those icons; they are codepoints in the Private Use Area, and thus not standardized in Unicode, which means they are not covered by most fonts.
\nI had previously been using the Nerd Fonts version of Fantasque Sans Mono for this purpose. Nerd Fonts is a project that takes existing fonts and embeds a whole bunch of icons from various icon sets in them. They distribute a number of pre-patched fonts, and, in fact, you can get those pre-patched fonts via Nixpkgs: pkgs.nerdfonts.override { fonts = [ "FantasqueSansMono" ]; }; will get you Fantasque Sans Mono with icons.
Of course, because the Fantasque patched by Nerd Fonts is the latest Fantasque, it will have ligatures. Fortunately, Nerd Fonts also provides a patcher script which can be used to embed all the various Nerd Fonts icon sets into any arbitrary OpenType font. This means we could have a derivation that patches a font—like the Fantasque Sans Ligatures from above—with the Nerd Fonts icons.
\nOne problem, though: in order to run the patcher, we also need a bunch of files from the Nerd Fonts repository, but the repository also contains a bunch of stuff we do not need, and is huge. The project commits all their pre-patched fonts, and there are a lot of patched fonts: over 3,000 files, totaling over 4 GB. This is a problem in two ways: one, downloading the whole repo takes a lot of network traffic—over 2 GB even compressed; two, hashing the downloaded repo is a long and resource-intensive process for Nix.
\nOne thing we can do is take inspiration from having already messed with fetchzip before, and override the unzipping script to exclude the big directories from the downloaded snapshot of the repository. This does not solve the downloading problem, but it solves the Nix hashing problem.
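\nA sketch of the resulting derivation (the Nerd Fonts release version, hash, and paths are illustrative):

```nix
{ lib, stdenv, fetchzip, python3, python3Packages, fantasque-sans-ligatures }:

stdenv.mkDerivation rec {
  pname = "fantasque-sans-ligatures-nerd";
  version = "2.1.0"; # an illustrative Nerd Fonts release

  src = fetchzip {
    url = "https://github.com/ryanoasis/nerd-fonts/archive/refs/tags/v${version}.zip";
    # Override the unzipping script: skip the huge pre-patched font
    # directory, then promote the archive's single top-level directory
    # to be the output itself.
    postFetch = ''
      unzip $downloadedFile -d "$TMPDIR/unpacked" -x '*/patched-fonts/*'
      mv "$TMPDIR"/unpacked/* $out
    '';
    sha256 = lib.fakeHash; # build once; Nix will report the real hash
  };

  nativeBuildInputs = [ python3 python3Packages.fontforge ];

  buildPhase = ''
    mkdir -p $out/share/fonts/opentype
    for font in ${fantasque-sans-ligatures}/share/fonts/opentype/*.otf; do
      # --complete embeds every icon set the patcher knows about.
      python font-patcher "$font" --complete --no-progressbars \
        --outputdir $out/share/fonts/opentype
    done
  '';

  dontInstall = true;
}
```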
This derivation is a bit more involved. Our src is the ZIP file containing the repository source snapshot that Github offers as a download for the given release. We extract it, omitting the font files that we do not need (this is what the -x argument of unzip is for).
The derivation takes Python and python3Packages.fontforge as inputs. We can find out that we need the latter by reading the Nerd Fonts project's readme, and we obviously need Python available if we want to run a Python script.
The actual build script goes through all the fonts in the Fantasque Sans Ligatures package, and calls the font-patcher script on them, asking it to output patched fonts to this derivation's output directory. The --complete argument asks the patcher to add all the icons it knows about; we could replace that with some subset of icons instead. We also skip drawing progress bars, because such things do not play well with line-based Nix build logs.
At this point, having to download a ZIP file that is gigabytes in size remains a problem. Fortunately, modern Git is actually pretty decent at not pulling more data than is needed.
\nTraditionally, cloning with Git downloads the whole tree for the HEAD commit, or the selected branch or tag, as well as the whole history. The latter can be truncated at some point with --depth, including at the commit itself, as is the case with --depth 1. This is a shallow clone. Such clones still involve pulling the whole tree, and with Nerd Fonts the problem is not just that the history is deep and contains a lot of data, but that a single commit is already a lot of data.
git clone also takes another useful argument: --filter. Using --filter=blob:none, we can tell Git that we do not want to pull the blobs—that is, all the files that end up in a checked-out directory tree. In this situation, Git will lazily pull blobs from the remote as needed, such as when we tell it to check out a commit and it notices that it does not already have the blobs needed for that checkout. This is a partial clone.
The third useful feature is sparse checkout. Git allows us to specify filters for paths, which get applied when performing a checkout. This essentially means telling Git that, when checking out a branch, we are interested only in a specific list of paths, and that it should not check out any other paths, even if they are in the commit. We can ask Git something like: "Please switch to this branch, but only give me src/ and README.md, and nothing else".
We can combine these three features to minimize the data we need to download to get a working patcher. The history can be skipped with --depth 1, as we will not need it. The blobs can be skipped at first with --filter=blob:none, combined with telling Git to not initially check out anything at all. Then, we can set a filter to only check out the files we will need, omitting the directories with thousands of fonts in them. The result is Git only pulling the small files we need, and not bothering with the huge directories. This helps us both keep the size of the local repository down and reduce the amount of network traffic.
More good news: we do not actually need to write a fixed-output derivation with a script that does this whole Git dance. In current Nixpkgs master, fetchGit and the various fetchers that use it (like fetchFromGitHub) now support a sparseCheckout argument, which makes the fetcher internally do something very similar to the manual example here. The author of the recently merged PR which adds it even mentions Nerd Fonts!
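\nA sketch of what that looks like (the rev and hash are illustrative; depending on the Nixpkgs revision, sparseCheckout may take a list of patterns rather than one string):

```nix
src = fetchFromGitHub {
  owner = "ryanoasis";
  repo = "nerd-fonts";
  rev = "v2.1.0"; # illustrative
  # Check everything out, except the giant font directories:
  sparseCheckout = ''
    /*
    !/patched-fonts
    !/src/unpatched-fonts
  '';
  sha256 = lib.fakeHash;
};
```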
The sparseCheckout syntax is essentially the same as the one used in .gitignore, except reversed: anything that matches will be checked out, and anything that does not match will be ignored. This is how we can add the src/ directory, but exclude its unpatched-fonts/ subdirectory (we do not need the unpatched fonts; we bring our own).
With some minor modifications, the Nerd Fonts patching derivation could be made even more fancy, allowing for patching of arbitrary fonts.
\nIt is thus, after meandering around various topics, that I have arrived at working fonts—at least some of them, as I did not discuss the boring ones.
\nThe conclusion here is that Nix is enjoyable if you are the kind of person who finds appeal in the idea of thinking "I should mess with my font setup" and then spending several hours diving into several different topics.
\nAfter publishing this article, I found out that the Nerd Fonts patcher is, in fact, already packaged in Nixpkgs. It is available under nerd-font-patcher (note that this is "font", not "fonts"). I now feel silly.
Note: Since this article was written, there have been changes to how Home Manager works with flakes. Additionally, while flakes are still marked as an experimental feature, the stable version of Nix has also since advanced. Newer versions of Nix and Home Manager have also changed where profiles are stored by default (there may now be a ~/.local/state/nix).
\nWhile a lot of the content of this article still broadly applies, you may wish to consult the Home Manager manual (specifically the chapter on flake use), or the Nix Reference Manual, for more up-to-date information.
\nIf you are a Nix user, you may have heard of Home Manager as the recommended way to manage your user environment. You may have also heard of flakes, the upcoming new way of managing dependencies and packages.
\nYou can combine the two, and use Home Manager with flakes. Frequently, this is done by including a Home Manager configuration in the (also flakified) system configuration. Home Manager can, however, be also used on operating systems other than NixOS, in which case the NixOS system configuration is not there. Fortunately, Home Manager can also use flakes entirely on user level. Let's assume, then, that we have an existing Home Manager installation, not currently based on flakes, and want to switch it over to flakes.
\nWith your usual Nix installation, each user can manage their environment with the nix-env tool. When the user installs or uninstalls things with nix-env, a new generation of the user's environment is created with these changes, providing a simple way of managing packages that resembles the classic package managers found on other operating systems. The nix-env approach does have some problems, however, broadly relating to it being imperative in a system focused on being declarative. To address these problems, Home Manager was created. With Home Manager, the user can write a configuration for their environment, similar to a NixOS system configuration. This Home Manager configuration specifies the packages and home directory configuration of a single user.
Flakes, on the other hand, are an upcoming feature of Nix itself. Flakes are intended to introduce a new way of declaring dependencies in Nix packages, improving reproducibility and, from a practical standpoint, providing a better alternative to the current approach of using channels and ad-hoc pinned dependencies. If you have ever used Niv, you can think of flakes as that, but better integrated into the Nix tool itself.
\nFlakes are a feature of the upcoming 2.4 version of Nix. The standard Nix, even in the unstable Nixpkgs channels, is Nix 2.3, but there is also pkgs.nixUnstable, which points to 2.4 builds. Both nix and nixUnstable come with a binary called nix, and so collide. If we want to experiment with flaky Nix without overriding stable Nix, we can make a wrapper (courtesy of the NixOS Wiki). An easy way to do this is to use our existing home.nix, the Home Manager configuration file:
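\nThe wrapper can look like this (the writeShellScriptBin approach from the NixOS Wiki):

```nix
home.packages = [
  (pkgs.writeShellScriptBin "nixFlakes" ''
    exec ${pkgs.nixUnstable}/bin/nix --experimental-features "nix-command flakes" "$@"
  '')
];
```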
Note that even with Nix 2.4, we need to have the flakes experimental feature explicitly enabled. This can be done by adding experimental-features = nix-command flakes to our nix.conf (generally found in ~/.config/nix), but if we do this, regular Nix 2.3 will complain about not knowing what experimental-features is. As an alternative, we apply these settings on the command line in the wrapper.
After switching to our new configuration, we will have a nixFlakes command in our PATH, and will be able to use it from the command line.
The flake.nix file
\nLet us assume that our prior Home Manager configuration was kept in a Git repository somewhere in our home directory, with ~/.config/nixpkgs/home.nix symlinked to a relevant file within the repository.
To create a basic flake.nix, we can execute nixFlakes flake init within our configuration repo. This command creates flakes out of templates, and if we don't explicitly ask for a specific template, it uses the default one, which looks something like this:
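\n(Reconstructed here from the Nix 2.4-era default template; the exact contents may differ slightly between versions.)

```nix
{
  description = "A very basic flake";

  outputs = { self, nixpkgs }: {
    packages.x86_64-linux.hello = nixpkgs.legacyPackages.x86_64-linux.hello;

    defaultPackage.x86_64-linux = self.packages.x86_64-linux.hello;
  };
}
```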
As we can see, a flake file is simply an attrset. The two important attributes in it are inputs (absent from the template) and outputs. inputs defines the dependencies of our flake, and outputs defines the things our flake offers for other flakes and tools to consume.
The template does not have explicit inputs; nixpkgs is resolved using our Nix flake registry (a locally stored map from flake names to flake URLs, which Nix populates from a global registry on the Internet). The template does have two outputs: packages, which defines one named package, and defaultPackage, which marks that one package as the flake's default.
Flakes can define various outputs, which are then used by various tools—defaultPackage, for instance, is what Nix would look at if we asked it to install this flake into our profile without explicitly asking for any specific package from inside the flake. Home Manager, on the other hand, uses the homeConfigurations output. homeConfigurations should be an attrset, mapping names to Home Manager configurations. The names are either usernames, or in the format of "username@hostname".
For inputs to our flake, we will want to include Home Manager. This does mean that our flake is technically independent of our current Home Manager installation. In fact, we could bootstrap Home Manager this way without having Home Manager installed in the first place! Having Home Manager already installed is also okay—our prior Home Manager generations will not be reset.
\nAssuming our username is someuser and our hostname somecomputer, we can come up with a basic flake like this:
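\nA sketch following the Home Manager flake API of the time (note stateVersion and friends sitting next to configuration, as discussed below):

```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
    homeManager.url = "github:nix-community/home-manager";
    homeManager.inputs.nixpkgs.follows = "nixpkgs";
  };

  outputs = { self, nixpkgs, homeManager }: {
    homeConfigurations."someuser@somecomputer" =
      homeManager.lib.homeManagerConfiguration {
        system = "x86_64-linux";
        username = "someuser";
        homeDirectory = "/home/someuser";
        stateVersion = "21.05";
        configuration = import ./home.nix;
      };
  };
}
```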
As we can see, "someuser@somecomputer" is mapped to a call to homeManager.lib.homeManagerConfiguration. We are calling a function exported by Home Manager's flake, which will create a Home Manager configuration out of the attrset that we pass to it.
configuration should be the most familiar bit of said attrset. It is essentially the same function as the one we would normally define in our home.nix. In fact, instead of defining our configuration inline in the flake, we could do configuration = import ./home.nix;, which is the practical way to move our old configuration to flakes. The one caveat is that some of the home settings that might previously have been in our home.nix file are now in the flake—home.stateVersion, home.username, and home.homeDirectory now live adjacent to, instead of in, configuration.
We can build our home-manager profile with just our flaky Nix, by building the activationPackage attribute of a particular configuration. In this specific case, this means invoking nixFlakes build '.#homeConfigurations."someuser@somecomputer".activationPackage'. After the build completes and we have our result folder, switching to the newly built configuration is simple: invoke result/activate from the command line.
We should now be in our new profile. It is, of course, possible that something has gone terribly wrong and our new profile is broken in some way. Fortunately, we are using Nix, so we can perform a rollback. Invoking home-manager generations should give us a list of generations, and paths to them. We can simply take the next-to-last path, append activate to it, and run that, which should reactivate our previous generation, rolling us back. If home-manager is absent from our PATH, we can also go to /nix/var/nix/profiles/per-user/someuser, where we can find a number of home-manager symbolic links. Again, if we follow the next-to-last link, we can use the activate script within to restore that generation.
The home-manager tool with flakes
\nWe could just repeat the nix build command whenever we want to switch to a new Home Manager generation, but that is a bit awkward. Fortunately, the home-manager tool, besides working with the usual ~/.config/nixpkgs/home.nix, can also work with ~/.config/nixpkgs/flake.nix.
Unfortunately, in this situation, we cannot simply create a symlink from ~/.config/nixpkgs/flake.nix to wherever our configuration repository is, as we would experience problems due to restricted mode; Nix will refuse to poke around the symlink target unless --impure is passed. One interesting—if seemingly hacky—thing we could do, however, is create another flake.
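\nA sketch of such a proxy flake, living at ~/.config/nixpkgs/flake.nix (the path matches the example used later):

```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
    homeManagerConfig = {
      url = "path:/home/someuser/stuff/home-manager-config";
      inputs.nixpkgs.follows = "nixpkgs";
    };
  };

  outputs = { self, nixpkgs, homeManagerConfig }: {
    # Re-export the real configuration's outputs under the expected name:
    homeConfigurations = homeManagerConfig.homeConfigurations;
  };
}
```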
Yes, we can use local filesystem paths as flake inputs. Of note is the fact that we could use network URLs here as well, and let Nix handle pulling our configuration from elsewhere, if we happen to have it on an easily-reachable server.
\nWe have also declared nixpkgs as a separate input, and pinned homeManagerConfig's nixpkgs input to our nixpkgs input. This is one way to achieve a workflow similar to the one we would previously have had with channels: our nixpkgs input will stay pinned where it is until it is explicitly updated. This can be useful if, say, we are changing something in our home-manager configuration and want to rebuild without necessarily downloading a lot of fresh updates.
We can update individual inputs by using --update-input: nix flake lock --update-input nixpkgs ~/.config/nixpkgs. home-manager switch and home-manager build should work with the configuration specified in ~/.config/nixpkgs/flake.nix, although we might have to make unstable Nix our main nix binary, rather than keep it behind nixFlakes.
If we do not add a flake.nix to our ~/.config/nixpkgs, we can instead explicitly point home-manager at a flake: home-manager switch --flake path:/home/someuser/stuff/home-manager-config.
One of the reasons to get into flakes is their ability to manage inputs other than mainline Nixpkgs. With stable Nix, the options here are usually either manually managing fetch* functions that pull in some repository, or adding Niv and letting it handle pinning the added packages. With flakes, however, this is handled natively by Nix itself.
There are two types of repositories we can encounter in the wild: ones which have their own flake.nix file, and ones which do not. Fortunately, Nix can handle even the non-flake inputs—we simply have to indicate that the input is a non-flake:
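\nA sketch using nixGL, the example discussed below (how exactly the source gets imported follows nixGL's own, non-flake instructions):

```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
    homeManager.url = "github:nix-community/home-manager";
    nixGL = {
      url = "github:guibou/nixGL";
      flake = false; # just fetch the source; don't expect flake outputs
    };
  };

  outputs = { self, nixpkgs, homeManager, nixGL }: {
    homeConfigurations."someuser@somecomputer" =
      homeManager.lib.homeManagerConfiguration {
        system = "x86_64-linux";
        username = "someuser";
        homeDirectory = "/home/someuser";
        stateVersion = "21.05";
        # Pass the input through to the configuration function:
        extraSpecialArgs = { inherit nixGL; };
        configuration = { pkgs, nixGL, ... }: {
          # Imported like any non-flake source; the exact arguments depend
          # on nixGL's own instructions.
          home.packages = [ (import nixGL { inherit pkgs; }).nixGLIntel ];
        };
      };
  };
}
```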
The main thing to point out here is extraSpecialArgs—this is how we pass extra arguments to the configuration function, outside of the pkgs which Home Manager will supply by itself. Inside configuration itself, we import nixGL just like we would if we had used Niv, or called fetchTarball or fetchFromGitHub ourselves. The details of how to import a given non-flake input, and what to do with it, will vary, but generally non-flake instructions should be adaptable to use with flakes.
Flakes, on the other hand, are more structured. Conventional flakes will generally provide packages, defaultPackage, and overlay as outputs. As such, we have two options for installing packages: either grabbing the package from the packages outputs directly, or applying the flake's overlay onto our imported Nixpkgs.
The former case is pretty simple:
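\nA sketch, with deploy-rs standing in as the example flake (matching the note at the end; nixpkgs and homeManager inputs as in the earlier flakes):

```nix
{
  inputs.deploy-rs.url = "github:serokell/deploy-rs";

  outputs = { self, nixpkgs, homeManager, deploy-rs }: {
    homeConfigurations."someuser@somecomputer" =
      homeManager.lib.homeManagerConfiguration {
        system = "x86_64-linux";
        username = "someuser";
        homeDirectory = "/home/someuser";
        stateVersion = "21.05";
        # Hand the flake input through to the configuration:
        extraSpecialArgs = { inherit deploy-rs; };
        configuration = { pkgs, deploy-rs, ... }: {
          # Grab the package straight from the flake's outputs:
          home.packages = [ deploy-rs.defaultPackage.x86_64-linux ];
        };
      };
  };
}
```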
\n\nDoing the same with an overlay is slightly more complicated:
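\nAgain a sketch; note the explicit nixpkgs import, with system and the overlay applied (nixpkgs and homeManager inputs as before):

```nix
{
  inputs.deploy-rs.url = "github:serokell/deploy-rs";

  outputs = { self, nixpkgs, homeManager, deploy-rs }:
    let
      system = "x86_64-linux";
      # Import nixpkgs ourselves, so we can apply the flake's overlay:
      pkgs = import nixpkgs {
        inherit system;
        overlays = [ deploy-rs.overlay ];
      };
    in {
      homeConfigurations."someuser@somecomputer" =
        homeManager.lib.homeManagerConfiguration {
          inherit pkgs system;
          username = "someuser";
          homeDirectory = "/home/someuser";
          stateVersion = "21.05";
          configuration = { pkgs, ... }: {
            # The overlay puts the tool under pkgs.deploy-rs.deploy-rs:
            home.packages = [ pkgs.deploy-rs.deploy-rs ];
          };
        };
    };
}
```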
\n\nWhile we could set nixpkgs.overlays inside the Home Manager configuration function, we can also apply the overlay in the flake, by defining the pkgs attribute (in fact, we can do both at the same time). The only caveat is that we will then also need to supply the system when importing nixpkgs in the flake.
The deploy-rs tool being under pkgs.deploy-rs.deploy-rs is a consequence of how the deploy-rs overlay is structured, and not a specific quirk of flake overlay use.
Hopefully these examples give a broad overview of how Nix flakes can be used in practice, but to augment them, here is some extra stuff to read:
\n