coredns: fix ENOTFOUND for own zone, enable dns64 for IPv4 clients

Two Corefile changes:

- Add lumpiasty.xyz server block without dns64. Replaces the manual RouterOS
  static FWD entry ("bypass nat64") which returned NOERROR with empty answer
  instead of relaying NXDOMAIN. Combined with ndots:5 and pod search domains
  this made getaddrinfo stop at the search-suffixed candidate and fail with
  ENOTFOUND for valid names (kaneo -> authentik OAuth fetch failures). CoreDNS
  relays rcodes faithfully; internal zone keeps real AAAA for native IPv6.
- Add allow_ipv4 to dns64 (previously uncommitted): without it only queries
  arriving over IPv6 are synthesized, but all clients reach CoreDNS via
  RouterOS over IPv4, so translate_all never applied.

The RouterOS static FWD entry must be removed after deploying the new image.
Ansible already declares only the ts.net entry, so a playbook run handles it.
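
The failure mode in the first item can be modelled in a few lines. This is a simplified sketch of search-list processing as the commit message describes it, not the actual libc code: `Rcode`, `candidates`, `resolve` and the two lookup functions are hypothetical names, and the zone contents and search domain are illustrative data.

```python
from enum import Enum


class Rcode(Enum):
    ANSWER = "NOERROR with records"
    NODATA = "NOERROR, empty answer"
    NXDOMAIN = "NXDOMAIN"


# Illustrative data only: one existing name and one pod search domain.
EXISTING = {"authentik.lumpiasty.xyz."}
SEARCH = ["homelab-infra.lumpiasty.xyz"]


def candidates(name, search, ndots=5):
    """Names to try, in order.

    Simplification: a name with a trailing dot, or with at least
    `ndots` dots, is only tried as an absolute name.
    """
    if name.endswith("."):
        return [name]
    if name.count(".") >= ndots:
        return [name + "."]
    return [f"{name}.{domain}." for domain in search] + [name + "."]


def faithful_lookup(fqdn):
    """Models a resolver that relays rcodes unchanged."""
    return Rcode.ANSWER if fqdn in EXISTING else Rcode.NXDOMAIN


def fwd_entry_lookup(fqdn):
    """Models the FWD entry: nonexistent names in the zone become NODATA."""
    if fqdn in EXISTING:
        return Rcode.ANSWER
    if fqdn.endswith(".lumpiasty.xyz."):
        return Rcode.NODATA
    return Rcode.NXDOMAIN


def resolve(name, lookup, search=SEARCH, ndots=5):
    """Return the first candidate that resolves, or raise LookupError."""
    for fqdn in candidates(name, search, ndots):
        rcode = lookup(fqdn)
        if rcode is Rcode.ANSWER:
            return fqdn
        if rcode is Rcode.NODATA:
            # The name is treated as existing without address records,
            # so the search loop stops here.
            raise LookupError(f"ENOTFOUND {name} (stopped at {fqdn})")
        # NXDOMAIN: fall through to the next candidate.
    raise LookupError(f"ENOTFOUND {name}")
```

With `faithful_lookup`, `resolve("authentik.lumpiasty.xyz", ...)` skips the search-suffixed candidate and returns the absolute name; with `fwd_entry_lookup` it raises at the first candidate, which is the behaviour this commit removes.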
@@ -192,11 +192,12 @@
     forward-to: 100.100.100.100
     match-subdomain: true
     comment: Tailscale MagicDNS
-  - name: lumpiasty.xyz
-    type: FWD
-    forward-to: 1.1.1.1
-    match-subdomain: true
-    comment: lumpiasty.xyz bypass nat64
+  # Do NOT add a lumpiasty.xyz FWD entry here. RouterOS FWD entries return
+  # NOERROR with an empty answer instead of relaying NXDOMAIN, which breaks
+  # getaddrinfo search-domain processing (ENOTFOUND for valid names in k8s
+  # pods). The DNS64 bypass for our own zone lives in the CoreDNS Corefile
+  # (mikrotik/coredns/Corefile, lumpiasty.xyz server block) which relays
+  # rcodes correctly. See docs/coredns-nat64.md pitfall #4.
 handle_absent_entries: remove
 handle_entries_content: remove_as_much_as_possible
 
@@ -20,7 +20,7 @@
 data:
   - dst: /var/lib/tailscale
     list: tailscale_state
-    src: tailscale/state
+    src: /tailscale/state
 handle_absent_entries: remove
 handle_entries_content: remove_as_much_as_possible
 
@@ -72,6 +72,15 @@
     comment: Allow Tayga NAT64 pool to internet
     out-interface: pppoe-gpon
     src-address: 192.168.240.0/20
+  # IPv6-only clients reaching internal services published on the public IP
+  # (e.g. authentik.lumpiasty.xyz -> 139.28.40.212 -> dst-nat -> 10.44.0.0/16)
+  # arrive from the Tayga pool after NAT64 translation. Without this rule
+  # they fall through to the final reject (hairpin via NAT64).
+  - action: accept
+    chain: forward
+    comment: Allow Tayga NAT64 pool to LoadBalancer (hairpin port forwards)
+    dst-address: 10.44.0.0/16
+    src-address: 192.168.240.0/20
   - action: jump
     chain: forward
     comment: Allow port forwards
@@ -446,6 +455,11 @@
     comment: Allow from IOT to internet only
     in-interface: vlan5
     out-interface-list: wan
+  - action: accept
+    chain: forward
+    comment: Allow from SRV to internet via NAT64
+    in-interface: vlan4
+    out-interface: nat64
   - action: accept
     chain: forward
     comment: Allow from IOT to internet via NAT64
+25
-6
@@ -67,19 +67,38 @@ Provides:
 - Static IPv6 route `64:ff9b::/96 → Tayga`
 - Masquerade of Tayga's IPv4 pool to WAN
-- PREF64 option in Router Advertisements (`/ipv6/nd pref64`)
-- DHCP option 108 to signal IPv6-only preference to capable clients
+- PREF64 + RDNSS options in Router Advertisements (per-interface `ipv6 nd` entries)
+- DHCP option 108 to signal IPv6-only preference to capable clients (sent only when requested)
 
 ## Client behaviour with DHCPv4 option 108
 
+Option 108 and PREF64 work as a pair; deploying one without the other breaks clients:
+
+- **Option 108** (RFC 8925): tells capable clients to drop IPv4. RouterOS only sends it to clients that request code 108 in their Parameter Request List (that is what the `force` flag on the option controls; we leave it unset). Legacy clients never see it.
+- **PREF64 in RA** (RFC 8781): tells the now IPv6-only client the NAT64 prefix so it can activate CLAT. Without PREF64, a client that honoured option 108 has no working translation and appears stuck "obtaining IP address".
+- **RDNSS in RA** (RFC 8106): IPv6-only clients ignore DHCPv4 entirely, including its `dns-server`. They need an IPv6 DNS address from RA. We advertise the router's per-VLAN IPv6 address; RouterOS DNS forwards to CoreDNS.
+
 | Client OS | Behaviour |
 |---|---|
-| iOS 16+, macOS 13+ | Activates CLAT, drops IPv4, uses NAT64 for IPv4 literals |
-| Android 10+ | Activates CLAT via PREF64, drops IPv4 |
+| iOS 16+, macOS 13+ | Requests 108, drops IPv4, activates CLAT via PREF64 |
+| Android 10+ | Requests 108, drops IPv4, activates CLAT via PREF64 |
 | Windows 11 (preview) | Partial: CLAT support in preview as of 2026 |
-| Linux (NetworkManager) | DHCP option 108 honoured; CLAT via NM requires PREF64 |
-| Legacy/unaware devices | Ignore option 108, receive IPv4 lease normally, continue dual-stack |
+| Linux (NetworkManager) | Honours option 108; CLAT requires PREF64 |
+| Legacy/unaware devices | Never request 108, receive IPv4 lease normally, dual-stack |
 
-Option 108 value is a 32-bit seconds timer. Set to 28 (0x1c) for testing, 86400 for production.
+Option 108 value is a 32-bit seconds timer (V6ONLY_WAIT, minimum 300 per RFC), refreshed on each DHCP renewal. We use 86400 (1 day) so a failed DNS64/NAT64 stack self-heals within a day by clients falling back to IPv4.
 
+### Deployment pitfalls (learned the hard way)
+
+Option 108 must never be deployed before the whole IPv6-only path works end to end. A client that honours it drops IPv4 immediately and depends on RA-provided PREF64 + RDNSS and a working NAT64. Each of these failure modes was hit in sequence, and every one presented identically on the phone ("stuck obtaining IP address" / "failed to connect"):
+
+1. **ND entries silently not created.** RouterOS ships only the `interface=all` default in `/ipv6/nd`. An `api_find_and_modify` task searching for `interface=vlan2` matches zero entries and silently succeeds (`require_matches_min` defaults to 0), so PREF64 was never advertised. Use `api_modify`, which creates missing entries.
+2. **RDNSS pointing at a nonexistent address.** VLAN IPv6 addresses came `from-pool`, so the actual prefix was dynamic (`:0::/64`), while the ND `dns=` advertised the documented-but-wrong `:9::/64` router address. Fixed by switching VLANs to static addressing: the HE prefix is static, so the pool indirection served no purpose.
+3. **`advertise-dns=no` on new ND entries.** RouterOS creates per-interface ND entries with `advertise-dns=no`, which suppresses the RDNSS option entirely, even when a static `dns=` list is configured on the entry. Must be set to `yes` explicitly.
+
+4. **RouterOS static FWD entries corrupt NXDOMAIN.** A manually added `type=FWD match-subdomain=yes` entry for `lumpiasty.xyz` (intended to bypass DNS64 for our own zone) returned `NOERROR` with an empty answer for nonexistent subdomains instead of relaying NXDOMAIN. Combined with `ndots:5` and the `homelab-infra.lumpiasty.xyz` search domain in kubernetes pods, `getaddrinfo` received NODATA for the search-suffixed candidate (`authentik.lumpiasty.xyz.homelab-infra.lumpiasty.xyz`), concluded the name exists, stopped the search loop, and never tried the absolute name. Apps failed with `ENOTFOUND` for perfectly valid hostnames while `nslookup` (absolute query) worked. The zone bypass now lives in the CoreDNS Corefile as a dedicated `lumpiasty.xyz:53` server block without `dns64`, which relays rcodes faithfully. RouterOS DNS does plain forwarding only; no FWD entries except Tailscale MagicDNS.
+
+Verification tooling: `rdisc6` (NixOS package `ndisc6`) shows the exact RA contents; RDNSS and PREF64 must both be present. When capturing DHCP in Wireshark, do not filter by client MAC: OFFER/ACK are sent to the broadcast MAC and disappear from the capture, hiding the server side of the exchange. When diagnosing DNS, the CoreDNS `log` plugin output is visible via `/log print` on the router (container `logging=yes`) and includes the rcode CoreDNS returned; comparing it with what the client received isolates which hop corrupts responses. Beware misleading test names: `*.example.com` legitimately returns NODATA upstream, making it useless for NXDOMAIN testing.
+
 ## CI/CD
 
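
The hunk above describes the option 108 value as a 32-bit seconds timer with a minimum of 300. As a rough illustration, the payload can be computed like this; `encode_v6only_wait` and `effective_wait` are hypothetical helper names, and how the bytes are entered into a particular DHCP server's configuration is not covered here.

```python
import struct

# Minimum V6ONLY_WAIT quoted in the documentation hunk above (seconds).
MIN_V6ONLY_WAIT = 300


def encode_v6only_wait(seconds):
    """Encode the timer as a 4-byte unsigned integer in network byte order."""
    if not 0 <= seconds <= 0xFFFFFFFF:
        raise ValueError("V6ONLY_WAIT must fit in 32 bits")
    return struct.pack("!I", seconds)


def effective_wait(seconds):
    """Timer after applying the 300-second floor mentioned above.

    Assumption: values below the minimum are raised to it.
    """
    return max(seconds, MIN_V6ONLY_WAIT)


print(encode_v6only_wait(86400).hex())  # 00015180
print(encode_v6only_wait(28).hex())     # 0000001c
print(effective_wait(28))               # 300
```

Under that assumption, the old test value of 28 seconds would not have produced a 28-second wait on a client that applies the floor, which is one reason to configure a value at or above 300.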
+11
-2
@@ -93,8 +93,12 @@ There are also networks, which are not VLANs, but are routed:
   Static assignment on CRS, access to factory IP of ONU
 - Containers on CRS<br>
   Access to every other network<br>
-  IP: 172.17.0.1/16, 2001:470:61a3:500::/64<br>
-  Static IP management
+  IP: 172.20.0.1/24, 2001:470:61a3:500::/64<br>
+  Static IP management, hosts Tailscale and CoreDNS (DNS64) containers
+- NAT64 link on CRS<br>
+  Dedicated bridge for the Tayga NAT64 container<br>
+  IP: 192.168.239.0/30, fc64::/126 (link), 192.168.240.0/20 (Tayga dynamic pool)<br>
+  IPv6 traffic to 64:ff9b::/96 is routed here for translation to IPv4
 
 Whole network is designed to eliminate VLANs, overlays where unnecessary to keep things simple. Only NAT rules are:
 
@@ -103,8 +107,13 @@ Whole network is designed to eliminate VLANs, overlays where unnecessary to keep
   It doesn't have a gateway configured, we want to access it from other networks so we need to talk to it as if we were in the same subnet
 - src-nat tailscale IPv6 to internet<br>
   Tailscale assigns IPv6 from private subnet with no way to configure it, so the assigned IPs are not routable
+- Masquerade Tayga NAT64 dynamic pool (192.168.240.0/20) via GPON PPPoE
 - IPv4 port forwards from GPON PPPoE to respective services
 
+## IPv6-mostly (NAT64/DNS64)
+
+LAN (vlan2) and IoT (vlan5) are IPv6-mostly networks (RFC 8925): clients capable of IPv6-only operation receive DHCP option 108, drop their IPv4 address, and activate CLAT using the NAT64 prefix advertised via PREF64 in router advertisements. Legacy clients keep dual-stack. DNS64 (CoreDNS container, with `translate_all`) synthesizes 64:ff9b::/96 AAAA answers so all named traffic exits via NAT64 (Tayga container) on our IPv4 WAN, bypassing the HE tunnel for egress and avoiding datacenter-IP captcha flagging. See [CoreDNS DNS64 + NAT64 design](./coredns-nat64.md) for details and deployment pitfalls.
+
 UPnP and NAT-PMP are also enabled to automatically configure port forwards from LAN.
 
 ## Uplink
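
For readers unfamiliar with what "synthesizes 64:ff9b::/96 AAAA answers" means in practice, the mapping can be reproduced with the standard library. This is a minimal sketch of the /96 embedding only (the IPv4 address occupies the low 32 bits); `synthesize_aaaa` and `extract_ipv4` are hypothetical helper names, not part of CoreDNS or Tayga.

```python
import ipaddress

# NAT64 prefix used throughout the documentation above.
NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize_aaaa(ipv4, prefix=NAT64_PREFIX):
    """Embed an IPv4 address in the low 32 bits of a /96 NAT64 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


def extract_ipv4(ipv6, prefix=NAT64_PREFIX):
    """Recover the embedded IPv4 address, as the NAT64 translator would."""
    addr = ipaddress.IPv6Address(ipv6)
    if addr not in prefix:
        raise ValueError(f"{addr} is not inside {prefix}")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


print(synthesize_aaaa("1.1.1.1"))         # 64:ff9b::101:101
print(extract_ipv4("64:ff9b::101:101"))   # 1.1.1.1
```

A client that receives such a synthesized address sends IPv6 traffic to it, the static route for the prefix delivers it to the translator, and the translator recovers the original IPv4 destination from the low 32 bits.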
@@ -1,3 +1,22 @@
+# Our own zone bypasses DNS64: internal services have native IPv6 (LB pool
+# routed via HE prefix), so clients should get real AAAA records and connect
+# directly instead of hairpinning through NAT64.
+#
+# This MUST live here, not as a RouterOS static FWD entry: RouterOS FWD
+# entries return NOERROR with an empty answer instead of relaying NXDOMAIN,
+# which breaks getaddrinfo search-domain processing (resolver stops at the
+# first NODATA search candidate and never tries the absolute name -> apps
+# fail with ENOTFOUND for names that exist).
+lumpiasty.xyz:53 {
+    forward . 1.1.1.1 8.8.8.8 {
+        prefer_udp
+    }
+
+    cache 300
+    errors
+    log
+}
+
 .:53 {
     # Synthesize AAAA from A records for all destinations.
     # translate_all: override real AAAA records too, so all traffic exits
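
One way to check that a resolver hop relays NXDOMAIN rather than turning it into an empty NOERROR is to inspect the response header directly. The sketch below is an assumption-laden illustration, not tooling from this repository: `build_query`, `classify` and `ask` are hypothetical names, only the 12-byte DNS header is parsed, and a NOERROR response whose answer section holds only a CNAME is reported as NOERROR even though resolvers may treat it as NODATA.

```python
import socket
import struct

RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, qtype=28, query_id=0x1234):
    """Build a minimal DNS query (default qtype 28 = AAAA, class IN, RD set)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def classify(response):
    """Return NXDOMAIN, NODATA, NOERROR or another rcode name."""
    if len(response) < 12:
        raise ValueError("truncated DNS header")
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    rcode = flags & 0x000F
    if rcode == 0 and ancount == 0:
        return "NODATA"
    return RCODES.get(rcode, f"RCODE{rcode}")


def ask(server, name, qtype=28, timeout=2.0):
    """Send one UDP query to `server` (IPv4 address) and classify the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server, 53))
        response, _addr = sock.recvfrom(4096)
    return classify(response)
```

Calling `ask` with the same nonexistent name against each hop in turn (the container address and the router address are deployment-specific and not assumed here) shows where an NXDOMAIN becomes NODATA.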