5 Commits

Author SHA1 Message Date
Renovate 98e140df2c Update renovate/renovate Docker tag to v43.233.4 2026-06-21 02:01:30 +00:00
Lumpiasty 4034628449 Fast fail connection when WAN failover
ci/woodpecker/push/flux-reconcile-source Pipeline was successful
ci/woodpecker/cron/renovate Pipeline was successful
2026-06-21 02:38:24 +02:00
Lumpiasty 1e86dc5e2b Detect GPON blackhole using ping
ci/woodpecker/push/flux-reconcile-source Pipeline was successful
2026-06-21 02:00:32 +02:00
Renovate b1751ec427 Merge pull request 'Update Helm release openebs to v4.5.1' (#339) from renovate/openebs-4.x into fresh-start
ci/woodpecker/push/flux-reconcile-source Pipeline was successful
ci/woodpecker/cron/renovate Pipeline was successful
2026-06-19 02:01:13 +00:00
Renovate b9458c46bd Update Helm release openebs to v4.5.1 2026-06-19 02:01:09 +00:00
6 changed files with 104 additions and 13 deletions
+1 -1
@@ -21,7 +21,7 @@ steps:
- bao kv get -mount secret -field RENOVATE_TOKEN renovate > /woodpecker/renovate_token
- bao kv get -mount secret -field GITHUB_COM_TOKEN renovate > /woodpecker/github_com_token
- name: Run Renovate
image: renovate/renovate:43.222.1
image: renovate/renovate:43.233.4
environment:
RENOVATE_AUTODISCOVER: "true"
RENOVATE_ENDPOINT: https://gitea.lumpiasty.xyz/api/v1
+48
@@ -1,8 +1,56 @@
---
- name: Configure WAN connection marking
community.routeros.api_modify:
path: ip firewall mangle
data:
- action: mark-connection
chain: forward
connection-state: new
new-connection-mark: wan-gpon
out-interface: pppoe-gpon
passthrough: true
comment: Mark connections going out GPON
- action: mark-connection
chain: forward
connection-state: new
new-connection-mark: wan-lte
out-interface: vlan6
passthrough: true
comment: Mark connections going out LTE
handle_absent_entries: remove
handle_entries_content: remove_as_much_as_possible
ensure_order: true
- name: Configure IPv4 firewall filter rules
community.routeros.api_modify:
path: ip firewall filter
data:
- action: reject
chain: forward
connection-mark: wan-gpon
out-interface: vlan6
protocol: tcp
reject-with: tcp-reset
comment: Fast-fail TCP connections that shifted from GPON to LTE
- action: reject
chain: forward
connection-mark: wan-gpon
out-interface: vlan6
reject-with: icmp-network-unreachable
comment: Fast-fail non-TCP connections that shifted from GPON to LTE
- action: reject
chain: forward
connection-mark: wan-lte
out-interface: pppoe-gpon
protocol: tcp
reject-with: tcp-reset
comment: Fast-fail TCP connections that shifted from LTE to GPON
- action: reject
chain: forward
connection-mark: wan-lte
out-interface: pppoe-gpon
reject-with: icmp-network-unreachable
comment: Fast-fail non-TCP connections that shifted from LTE to GPON
- action: fasttrack-connection
chain: forward
connection-state: established,related
+34 -4
@@ -12,15 +12,44 @@
scope: 30
suppress-hw-offload: false
target-scope: 10
- disabled: false
- comment: GPON Monitor 1
disabled: false
distance: 1
dst-address: 1.0.0.1/32
gateway: pppoe-gpon
routing-table: main
scope: 10
suppress-hw-offload: false
target-scope: 10
- comment: GPON Monitor 2
disabled: false
distance: 1
dst-address: 8.8.4.4/32
gateway: pppoe-gpon
routing-table: main
scope: 10
suppress-hw-offload: false
target-scope: 10
- comment: GPON Default 1
disabled: false
distance: 1
dst-address: 0.0.0.0/0
gateway: pppoe-gpon
gateway: 1.0.0.1
check-gateway: ping
routing-table: main
scope: 30
suppress-hw-offload: false
target-scope: 10
vrf-interface: pppoe-gpon
target-scope: 11
- comment: GPON Default 2
disabled: false
distance: 2
dst-address: 0.0.0.0/0
gateway: 8.8.4.4
check-gateway: ping
routing-table: main
scope: 30
suppress-hw-offload: false
target-scope: 11
handle_absent_entries: remove
handle_entries_content: remove_as_much_as_possible
@@ -32,6 +61,7 @@
distance: 1
dst-address: 2000::/3
gateway: 2001:470:70:dd::1
check-gateway: ping
scope: 30
target-scope: 10
- comment: Tailnet
+1
@@ -10,6 +10,7 @@
password: "{{ routeros_pppoe_password }}"
# Using CoreDNS container with DNS64
use-peer-dns: false
add-default-route: false
user: "{{ routeros_pppoe_username }}"
handle_absent_entries: remove
handle_entries_content: remove_as_much_as_possible
+19 -7
@@ -84,9 +84,10 @@ subnets would fail routing lookup with "net unreachable" without it.
| Destination | Source | Distance | Active when |
|---|---|---|---|
| `0.0.0.0/0` | static via `pppoe-gpon` | 1 | GPON up |
| `1.0.0.1/32`, `8.8.4.4/32` | static via `pppoe-gpon` | 1 | always |
| `0.0.0.0/0` | static via `1.0.0.1`, `8.8.4.4` (recursive) | 1, 2 | GPON ping check succeeds |
| `0.0.0.0/0` | BGP from D-Link via `192.168.6.2` | 200 | wwan up on D-Link |
| `2000::/3` | static via `sit1` (HE tunnel) | 1 | sit1 active (HE tunnel works) |
| `2000::/3` | static via `2001:470:70:dd::1` (HE tunnel) | 1 | HE tunnel ping check succeeds |
| `2000::/3` | BGP from D-Link via `2001:470:61a3:600::2` | 200 | wwan up on D-Link |
RouterOS distance comparison is straightforward: distance 1 always wins
@@ -136,11 +137,12 @@ preferred route for D-Link's own traffic.
- **wwan modem goes down** → BIRD2 device protocol detects wwan0 down →
static `lte_default` / `lte_default6` become unreachable → BGP withdraws
announcements → CRS removes BGP-learned default
- **GPON drops** → `pppoe-gpon` interface down → CRS distance-1 default
route inactive → distance-200 BGP route activates → CRS withdraws its
default-originate announcement to D-Link (since no default is installed
any more) → D-Link's kernel default-via-CRS is removed → D-Link uses
wwan kernel default → traffic flows from CRS via vlan6 → D-Link → wwan
- **GPON drops or blackholes** → recursive ping checks (1.0.0.1, 8.8.4.4) over `pppoe-gpon`
fail (takes ~20s: 10s ping interval + 10s timeout) → CRS distance-1/2 default routes inactive → distance-200 BGP route
activates → CRS withdraws its default-originate announcement to D-Link (loop
prevention stops CRS from reflecting D-Link's own route back) → D-Link's kernel
default-via-CRS is removed → D-Link uses wwan kernel default → traffic flows
from CRS via vlan6 → D-Link → wwan
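
The selection among the three IPv4 defaults can be pictured with a small model. This is only an illustrative sketch, not RouterOS code: the `Route` class, the route names, and the boolean `reachable` flag (standing in for the result of `check-gateway: ping` or of the BGP session) are assumptions made for the example. The distances are the ones from the route table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str        # hypothetical label, for illustration only
    distance: int    # administrative distance from the route table
    reachable: bool  # stand-in for check-gateway / BGP state


def active_default(routes):
    """Return the reachable route with the lowest distance, or None."""
    candidates = [r for r in routes if r.reachable]
    return min(candidates, key=lambda r: r.distance, default=None)


def default_routes(gpon_ping_1, gpon_ping_2, lte_bgp_up):
    """Build the three IPv4 defaults described in the table above."""
    return [
        Route("gpon-via-1.0.0.1", 1, gpon_ping_1),
        Route("gpon-via-8.8.4.4", 2, gpon_ping_2),
        Route("lte-bgp-via-192.168.6.2", 200, lte_bgp_up),
    ]
```

In this model the BGP route only becomes active once both ping checks report failure, which is the GPON blackhole case that interface state alone could not detect.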
All transitions are automatic and driven by interface state. No active
probing (Netwatch / mwan3), no scripts toggling routes.
@@ -241,6 +243,16 @@ QMI initialization within ~1 second.
Full investigation: see [wwan-bm806c-qmi-workaround.md](./wwan-bm806c-qmi-workaround.md).
## Multi-WAN Stale Connection Tracking
When the routing table fails over from GPON to LTE (or vice versa), RouterOS does not automatically clear existing connection tracking entries. If an established TCP/UDP connection is routed out the new WAN interface, it retains the NAT translation state (source IP) of the old WAN interface. The packet is sent to the ISP with the wrong source IP and is silently dropped, causing clients (like Tailscale) to hang for minutes until their internal sockets time out.
To solve this purely declaratively without scripts or blanket connection flushes, the `forward` chain is configured to "fast-fail" these shifted connections:
1. Connections are marked with their egress WAN upon establishment (`wan-gpon` or `wan-lte`) via the `mangle` table.
2. If an established connection with a `wan-gpon` mark attempts to route out `vlan6` (LTE), or a `wan-lte` mark routes out `pppoe-gpon`, it is explicitly rejected (`tcp-reset` for TCP, `icmp-network-unreachable` for all other protocols, including UDP) before reaching the NAT table.
3. This rejection immediately signals the client OS that the route is dead, forcing the application (Tailscale, SIP clients, etc.) to instantly close the socket and establish a new one, which successfully binds to the new WAN interface and NAT state.
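
The three steps above can be sketched as a decision function. This is a simplified model under stated assumptions, not the router's actual packet flow: the function names and the `"continue"` verdict (meaning the packet falls through to the remaining filter rules) are made up for the example, while the interface names, marks, and reject types are taken from the firewall rules in this change.

```python
# Egress interface -> connection mark, as set by the mangle rules.
WAN_MARK_BY_INTERFACE = {"pppoe-gpon": "wan-gpon", "vlan6": "wan-lte"}


def mark_new_connection(out_interface):
    """Step 1: mark a new connection with the WAN it first leaves through."""
    return WAN_MARK_BY_INTERFACE.get(out_interface)


def forward_verdict(connection_mark, out_interface, protocol):
    """Step 2: reject a connection whose mark no longer matches its egress WAN."""
    expected = WAN_MARK_BY_INTERFACE.get(out_interface)
    if connection_mark is None or expected is None or connection_mark == expected:
        return "continue"  # hypothetical verdict: fall through to later rules
    # TCP gets a reset; everything else gets an ICMP network-unreachable.
    return "tcp-reset" if protocol == "tcp" else "icmp-network-unreachable"
```

For example, a Tailscale UDP flow marked `wan-gpon` that is now routed out `vlan6` yields `icmp-network-unreachable`, while the replacement flow the client opens is marked `wan-lte` and passes.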
## Implementation files
| File | Role |
+1 -1
@@ -23,7 +23,7 @@ spec:
chart:
spec:
chart: openebs
version: 4.5.0
version: 4.5.1
sourceRef:
kind: HelmRepository
name: openebs