From e009040cb4f21dc8e677e6cc91ead6dfab98a421 Mon Sep 17 00:00:00 2001
From: Lumpiasty
Date: Wed, 17 Jun 2026 00:30:46 +0200
Subject: [PATCH] add peer api server to remedy DNS

---
 Dockerfile     | 26 +++++++++++++++++
 docs/DESIGN.md | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 101 insertions(+)

diff --git a/Dockerfile b/Dockerfile
index 98b2ccb..dd80d77 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -133,6 +133,31 @@ COPY patches/stderr_verbosity_filter.go cmd/tailscaled/
 # gro — Generic Receive Offload (perf). Depends on netstack;
 # pulled in with it. Small, and improves throughput on
 # the netstack DNS/inject path.
+# peerapiserver — REQUIRED to be a functional exit node. In v1.98.5
+# 'advertiseexitnode' DECLARES a dependency on
+# peerapiserver (featuretags.go Deps, "to run the ExitDNS
+# server"), but this build's allowlist works by stripping
+# individual ts_omit_ tags and does NOT re-resolve Deps —
+# so featuretags --min still emitted ts_omit_peerapiserver
+# and our advertiseexitnode opt-in alone left it omitted.
+# peerapiserver gates the entire PeerAPI HTTP server,
+# including the /dns-query DoH endpoint (peerapi.go,
+# guarded by buildfeatures.HasPeerAPIServer). Without it
+# initPeerAPIListenerLocked() returns early: the node
+# never advertises the PeerAPIDNS service, so exit-node
+# CLIENTS' exitNodeCanProxyDNS(thisNode) returns false.
+# With no tailnet global nameserver configured, the
+# client's resolver then has an empty Routes["."] and
+# returns an INSTANT authoritative SERVFAIL locally
+# (forwarder.go servfailResponse, aa=1, 0 ms, no I/O) —
+# i.e. devices using this router as their exit node could
+# not resolve PUBLIC names. Including peerapiserver makes
+# the node serve the exit-node DoH DNS proxy, so clients
+# get public DNS automatically (the normal exit-node
+# behavior) with no tailnet DNS config required.
+# peerapiserver has NO Deps and pulls in no large
+# subsystems — a small addition. (outboundproxy is NOT
+# needed for this and stays omitted.)
 #
 # Everything else remains omitted, including (rationale):
 # clientupdate — DELIBERATELY removed. The built-in updater would download
@@ -180,6 +205,7 @@ RUN mkdir -p /out && \
  -e 's/ts_omit_ipnbus,\{0,1\}//g' \
  -e 's/ts_omit_netstack,\{0,1\}//g' \
  -e 's/ts_omit_gro,\{0,1\}//g' \
+ -e 's/ts_omit_peerapiserver,\{0,1\}//g' \
  -e 's/,$//' \
  ) && \
  echo "Build tags: ${TAGS}" && \
diff --git a/docs/DESIGN.md b/docs/DESIGN.md
index 1b97f09..2b0d31e 100644
--- a/docs/DESIGN.md
+++ b/docs/DESIGN.md
@@ -155,6 +155,7 @@ that's a separate build, not just a `--platform` change.
 | `accept-routes` | Receive subnet routes from other tailnet nodes |
 | DNS / MagicDNS | Resolve `*.ts.net` names (resolver + resolv.conf manager). **Note:** serving `100.100.100.100` also requires `netstack` — see [Why netstack is required (even with a kernel TUN)](#why-netstack-is-required-even-with-a-kernel-tun) |
 | `netstack` + `gro` | gVisor userspace stack. Counter-intuitively **required** to serve MagicDNS on `100.100.100.100`, even though the router uses a real kernel TUN — see [Why netstack is required (even with a kernel TUN)](#why-netstack-is-required-even-with-a-kernel-tun) |
+| `peerapiserver` | Serves the PeerAPI, including the `/dns-query` DoH endpoint that lets **exit-node clients resolve public DNS automatically**. A declared dependency of `advertise-exit-node` that the allowlist didn't pull in — see [Why peerapiserver is required for exit-node DNS](#why-peerapiserver-is-required-for-exit-node-dns) |
 | portmapper (NAT-PMP/PCP/UPnP) | Punch through upstream NAT |
 | listenrawdisco | Raw socket disco for better NAT traversal |
 | health | Powers `tailscale status` output |
@@ -308,6 +309,80 @@ itself is refactored — re-test whether `netstack` can be dropped again. The
 canary is simple: from inside the container, `dig google.com @100.100.100.100`
 must return answers and `ping ..ts.net` must resolve.
 
+### Why peerapiserver is required for exit-node DNS
+
+This is a second non-obvious DNS inclusion, and it exposes a limitation of the
+allowlist build strategy.
+
+**Symptom.** With `netstack` enabled, MagicDNS worked from the router and from
+LAN hosts, including public names. But a device using this router **as its exit
+node** could not resolve public names: `dig google.com @100.100.100.100` on the
+*client* returned an instant authoritative `SERVFAIL` (`flags: qr aa rd ad`,
+`Query time: 0 msec`, "recursion not available"). Tailnet names and raw-IP
+connectivity (e.g. `ping 1.1.1.1`) through the exit node worked.
+
+**Root cause.** The `SERVFAIL` is generated **on the client**, locally, with no
+network I/O — which is why it is instant and authoritative. The path
+(traced through v1.98.5 source):
+
+1. The client's query for `google.com` reaches its in-process resolver, which
+   determines the name is not a tailnet name and marks it for forwarding
+   (`net/dns/resolver/tsdns.go`).
+2. The forwarder looks up which upstream resolver to use for the catch-all
+   `"."` route (`net/dns/resolver/forwarder.go` → `resolvers()`).
+3. That route set is **empty**, so `forwardWithDestChan` short-circuits and
+   synthesises an authoritative `SERVFAIL` (`servfailResponse`, `aa=1`) without
+   opening any socket. The query never reaches this router at all.
+
+Why the route set is empty: when a client selects an exit node,
+`dnsConfigForNetmap` (`ipn/ipnlocal/node_backend.go`) deliberately routes **all**
+default DNS through the exit node and drops the client's own LAN/system
+resolver — the whole premise of an exit node is "send everything, including
+DNS, through me." It does this by setting the client's default resolver to the
+exit node's **DoH proxy** URL (`http:///dns-query`). But that only happens
+if `exitNodeCanProxyDNS(thisRouter)` returns true — i.e. if **this router
+advertises a working PeerAPI DoH endpoint**. If it does not, and there is no
+tailnet global nameserver to fall back to, the client ends up with an empty
+default route and returns `SERVFAIL`.
+
+**Why this router didn't advertise the DoH proxy.** The `/dns-query` DoH
+endpoint is part of the **PeerAPI server**, gated by
+`buildfeatures.HasPeerAPIServer` (`ipn/ipnlocal/peerapi.go`). With
+`ts_omit_peerapiserver`, `initPeerAPIListenerLocked()` returns early: no PeerAPI
+listener is created, the `PeerAPIDNS` service is never advertised, and
+`peerCanProxyDNS()` is false for this node on every client.
+
+**The allowlist gap that caused it.** In `feature/featuretags/featuretags.go`,
+`advertiseexitnode` **declares a dependency on `peerapiserver`** ("to run the
+ExitDNS server"). Upstream's own `--add` resolution would have pulled it in.
+But this build's allowlist works differently: it runs `featuretags --min` to get
+the full omit set, then strips the specific `ts_omit_` tags it wants —
+it does **not** re-resolve transitive `Deps`. So opting in `advertiseexitnode`
+did not pull in `peerapiserver`, and `featuretags --min` had emitted
+`ts_omit_peerapiserver`, leaving the node an exit node *without* its declared
+ExitDNS dependency — a feature combination upstream's graph says shouldn't
+occur. Including `peerapiserver` explicitly closes the gap.
+
+> **Known limitation:** the allowlist (strip-individual-`ts_omit_`-tags) does
+> not resolve feature dependencies. When opting a feature in, check its `Deps`
+> in `featuretags.go` and add them explicitly. `peerapiserver` is the only such
+> gap found and fixed so far; a full dependency audit has not been done.
+
+**Cost.** Negligible. `peerapiserver` has **no** `Deps` and pulls in no large
+subsystems; measured at ~+10 kB on the UPX'd binary (arm64), rootfs unchanged
+within measurement noise.
+
+**Result.** The router now serves the exit-node DoH DNS proxy, so devices using
+it as their exit node resolve public names automatically — the normal exit-node
+behavior — with **no** tailnet DNS configuration required. (Setting a tailnet
+global nameserver in the admin console is an alternative runtime fix that also
+works, by populating the client's default resolver directly; it is not required
+once the router serves the proxy.)
+
+**Canary for future bumps:** from a client using this router as exit node,
+`dig google.com @100.100.100.100` must return real answers with `flags: ... ra`
+(recursion available) and a non-zero query time.
+
 ### Log verbosity filtering
 
 Upstream `tailscaled` embeds verbosity tags (`[v1]`, `[v2]`, …) inside its log