Get rid of NAT64 setup
2026-06-16 00:29:18 +02:00
parent b993115b41
commit 679ebb3465
13 changed files with 316 additions and 419 deletions
@@ -0,0 +1,110 @@
# CoreDNS resolver
## Goal
Replace the RouterOS built-in DNS forwarder with a CoreDNS container for
configurability, and suppress IPv6 (AAAA) resolution by default to keep traffic
on IPv4.
## Background
The ISP provides no native IPv6 — only a Hurricane Electric (HE) tunnel
(`2001:470:61a3::/48`). HE addresses fall in ranges some sites flag as
datacenter/bot traffic, producing endless CAPTCHAs. The goal is to prefer IPv4
egress while keeping IPv6 available for our own services and any domain
explicitly trusted over IPv6.
## What this is NOT (and why)
An earlier iteration used **DNS64 + NAT64 (Tayga)** to force traffic through
IPv4. It was removed:
- **Performance**: Tayga is a userspace translator with no hardware offload.
Every translated packet crossed RouterOS twice (v6 in, v4 out) plus a
userspace hop, capping throughput at ~250 Mbps against a 1 Gbps line.
- **Single point of failure (SPOF)**: two containers (CoreDNS + Tayga) sat in
the datapath of nearly all traffic, on a router whose native forwarder had
been rock-solid.
- **Architectural inversion**: NAT64 exists to let IPv6-only clients reach IPv4.
We don't want IPv6 egress at all — using NAT64 to avoid IPv6 was solving the
problem backwards.
Plain AAAA suppression in CoreDNS achieves the same IPv4-preferred outcome with
zero datapath overhead — DNS is the only thing touched, packet forwarding stays
on the RouterOS fastpath at line rate.
The full account of the NAT64/IPv6-mostly attempt and why it was abandoned is in
[nat64-dns64-postmortem.md](./nat64-dns64-postmortem.md).
## How it works
CoreDNS runs as a single container (`172.20.0.3`). RouterOS DNS forwards client
queries to it. The [Corefile](../mikrotik/coredns/Corefile) defines these server
blocks:
1. **`lumpiasty.xyz`** — our own zone. Forwards normally, keeps AAAA, so internal
services reachable over the HE prefix resolve to their real IPv6 addresses.
2. **`.` (default)** — forwards everything else, but a `template IN AAAA` block
returns empty NOERROR for all AAAA queries, so clients fall back to IPv4 and
avoid the HE tunnel's flagged egress. A records and all other types pass
through untouched.
The whitelist is implemented as a reusable `(aaaa_allowed)` snippet imported by
zones that should keep AAAA. To trust another domain over IPv6, add a server
block for it that imports `aaaa_allowed`.
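
The routing policy described above can be modeled in a few lines. This is only an illustrative sketch in Python, not the Corefile itself: the function names are hypothetical, and it assumes a query is handled by the most specific zone that contains it, with everything else falling through to `.`.

```python
def matching_zone(qname: str, zones: list[str]) -> str:
    """Return the most specific listed zone containing qname, or "." if none."""
    name = qname.rstrip(".").lower()
    best = "."
    for zone in zones:
        z = zone.rstrip(".").lower()
        # Match on label boundaries so "notlumpiasty.xyz" is not in "lumpiasty.xyz".
        if z and (name == z or name.endswith("." + z)):
            if best == "." or len(z) > len(best):
                best = z
    return best


def action(qname: str, qtype: str, aaaa_allowed_zones: list[str]) -> str:
    """Model the policy: AAAA survives only in zones that keep AAAA."""
    zone = matching_zone(qname, aaaa_allowed_zones)
    if qtype.upper() == "AAAA" and zone == ".":
        return "empty NOERROR"  # default block: suppress AAAA
    return "forward"            # A and other types, or a trusted zone
```

With `["lumpiasty.xyz"]` as the only trusted zone, an AAAA query for a name under `lumpiasty.xyz` is forwarded, while an AAAA query for any other name gets the empty NOERROR answer.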
### Why NODATA, not NXDOMAIN
The AAAA template returns NOERROR with an empty answer (NODATA), not NXDOMAIN.
This is correct: the name exists, it just has no (advertised) AAAA. Clients
treat it as "no IPv6 address" and use the A record. Returning NXDOMAIN would
wrongly imply the name doesn't exist and break the A lookup.
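
On the wire the two outcomes differ only in the header: NXDOMAIN is rcode 3, while NODATA is rcode 0 (NOERROR) with an answer count of zero. The following sketch (standard library only, function name hypothetical) classifies a raw DNS response from its 12-byte header. It assumes the response comes from a recursive resolver, so an empty NOERROR is treated as NODATA rather than a referral.

```python
import struct


def classify(response: bytes) -> str:
    """Classify a raw DNS response as ANSWER, NODATA, NXDOMAIN or RCODE<n>."""
    if len(response) < 12:
        raise ValueError("truncated DNS header")
    # Header layout: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (16 bits each).
    _txid, flags, _qd, ancount, _ns, _ar = struct.unpack("!6H", response[:12])
    rcode = flags & 0x000F  # low four bits of the flags word
    if rcode == 3:
        return "NXDOMAIN"   # the name does not exist at all
    if rcode == 0:
        # NOERROR: the name exists; zero answers means "no record of this type".
        return "NODATA" if ancount == 0 else "ANSWER"
    return f"RCODE{rcode}"  # SERVFAIL, REFUSED, ...
```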
## Future improvement
The current global-suppress-plus-whitelist is coarse: a domain that is genuinely
IPv6-only (no A record) and not whitelisted becomes unreachable. The intended
end state is a plugin that suppresses AAAA only when the domain also has an A
record, so IPv6-only destinations keep working without manual whitelisting. No
in-tree CoreDNS plugin does this today.
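
The decision such a plugin would make is small. A sketch of the intended rule, in Python for illustration only (a real CoreDNS plugin would be written in Go, and `filter_aaaa` is a hypothetical name):

```python
def filter_aaaa(a_records: list[str], aaaa_records: list[str]) -> list[str]:
    """Return the AAAA records that should be handed to the client.

    Dual-stack names (at least one A record) lose their AAAA so clients use
    IPv4. IPv6-only names keep their AAAA so they stay reachable.
    """
    if a_records:
        return []              # suppress: IPv4 is available
    return list(aaaa_records)  # IPv6-only: pass AAAA through
```

The cost of this approach is an extra A lookup for every AAAA query, which the `cache` plugin would likely absorb for popular names.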
## Custom image
The image is built from source with a minimal plugin set (`errors`, `log`,
`health`, `template`, `cache`, `forward`, `reload`) instead of the default ~40,
which yields a ~6-8 MB image. The `dns64` plugin is no longer compiled in.
Source: [`mikrotik/coredns/`](../mikrotik/coredns/). Built by Woodpecker
([`.woodpecker/coredns-build.yaml`](../.woodpecker/coredns-build.yaml)) on pushes
touching `mikrotik/coredns/**`, pushed to `gitea.lumpiasty.xyz/lumpiasty/coredns-mikrotik`.
## RouterOS integration
- `/ip/dns servers=172.20.0.3` — RouterOS forwards client queries to CoreDNS
- RDNSS in RA (`/ipv6/nd dns=...` on vlan2/vlan5) advertises an IPv6 resolver
(the router's per-VLAN address) to dual-stack clients; RouterOS DNS relays to
CoreDNS
- No DHCP option 108, no PREF64 — those belonged to the removed IPv6-mostly setup
## Pitfalls learned (kept for reference)
These were hit during the NAT64 era and the migration; some still apply:
1. **RouterOS static FWD entries corrupt NXDOMAIN.** A `type=FWD match-subdomain=yes`
entry returns NOERROR/empty instead of relaying NXDOMAIN. Combined with
`ndots:5` and kubernetes pod search domains, `getaddrinfo` stops at the first
search-suffixed NODATA candidate and never tries the absolute name — apps fail
with `ENOTFOUND` for valid hostnames while `nslookup` (absolute query) works.
Our own zone is therefore handled in the Corefile, not via a RouterOS FWD
entry. RouterOS DNS does plain forwarding only (plus the Tailscale `ts.net`
FWD, which is acceptable as its subdomains genuinely don't exist publicly).
2. **`advertise-dns=no` on new ND entries.** RouterOS creates per-interface
`ipv6 nd` entries with `advertise-dns=no`, suppressing the RDNSS option even
when a static `dns=` list is set. Must be enabled explicitly.
3. **Per-interface ND entries must be created, not modified.** Only the
`interface=all` default ships out of the box, so `api_find_and_modify` matching
a specific interface silently matches nothing. Use `api_modify` so that the
entry gets created.
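
Pitfall 1 is easier to see with a simplified model of search-list resolution. The sketch below is not libc code. It only encodes the behavior described above: candidates are tried in order, NXDOMAIN moves on to the next candidate, and anything else (including NODATA) ends the search. The search domains used in the usage note are typical Kubernetes values and are an assumption.

```python
from typing import Callable, Optional


def candidates(name: str, search: list[str], ndots: int) -> list[str]:
    """Order in which a stub resolver would try fully qualified names."""
    if name.endswith("."):
        return [name]  # absolute query: the search list is skipped
    absolute = name + "."
    suffixed = [f"{name}.{s.rstrip('.')}." for s in search]
    # Names with fewer than `ndots` dots go through the search list first.
    if name.count(".") >= ndots:
        return [absolute] + suffixed
    return suffixed + [absolute]


def resolve(name: str, search: list[str], ndots: int,
            lookup: Callable[[str], str]) -> tuple[Optional[str], str]:
    """lookup(fqdn) returns 'ANSWER', 'NODATA' or 'NXDOMAIN'."""
    for fqdn in candidates(name, search, ndots):
        result = lookup(fqdn)
        if result != "NXDOMAIN":
            return fqdn, result  # the search stops here, even on NODATA
    return None, "NXDOMAIN"
```

With `ndots=5` and a search list such as `default.svc.cluster.local`, `svc.cluster.local`, `cluster.local`, a lookup of `example.com` first tries `example.com.default.svc.cluster.local.`. If the forwarder turns that NXDOMAIN into NODATA, the model stops there and `example.com.` is never queried, which matches the `ENOTFOUND` symptom.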
Verification: `rdisc6` (NixOS package `ndisc6`) dumps RA contents. The CoreDNS
`log` plugin output is visible via `/log print` on the router (container
`logging=yes`) and shows the rcode CoreDNS returned — comparing it to what the
client received isolates which hop corrupts a response.
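
For the client side of that comparison, a small standard-library probe is enough when `dig` is not at hand. This is a sketch under assumptions: the function names are hypothetical, it speaks plain UDP on port 53, and it does no retries and no TCP fallback.

```python
import socket
import struct

QTYPES = {"A": 1, "AAAA": 28}


def build_query(name: str, qtype: str = "AAAA", txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query: one question, class IN, recursion desired."""
    header = struct.pack("!6H", txid, 0x0100, 1, 0, 0, 0)
    labels = [label for label in name.rstrip(".").split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!2H", QTYPES[qtype], 1)


def query_rcode(server: str, name: str, qtype: str = "AAAA",
                timeout: float = 2.0) -> tuple[int, int]:
    """Send one query over UDP and return (rcode, answer_count)."""
    family = socket.AF_INET6 if ":" in server else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server, 53))
        data, _ = sock.recvfrom(4096)
    _txid, flags, _qd, ancount = struct.unpack("!4H", data[:8])
    return flags & 0x000F, ancount
```

Calling `query_rcode("172.20.0.3", "example.com")` and then repeating the call against the router's resolver address gives the two values to compare: a hop that turns rcode 3 into rcode 0 with zero answers is the one corrupting NXDOMAIN.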