Design notes

This document records design decisions worth keeping. It is not a chronological log of past experiments — only the principles current code follows and that future changes should respect.
1. Daemon contract
Stateful work is carried out by dedicated daemons. Every daemon exposes the same surface so tooling can interact with them uniformly:
- HTTP+JSON API over a Unix domain socket
/v1/status/v1/healthz/v1/events/v1/commands/reload/v1/commands/renew/v1/commands/stop- a state or lease file
events.jsonl(append-only)
This contract is shared by routerd-dhcpv6-client, routerd-dhcpv4-client, routerd-pppoe-client, and routerd-healthcheck.
2. DHCPv6-PD
DHCPv6-PD is owned by routerd-dhcpv6-client. There is no longer a code path that generates configuration for an OS-bundled client.
For ordinary residential gateways, the standard solicit / advertise / request / renew sequence with lease persistence and T1 renewal is sufficient. Aggressive retries that were once needed to work around broken environments are no longer the default.
3. Honest LAN advertisement
When DHCPv6-PD is not Bound, routerd does not advertise stale IPv6 information to the LAN. This applies to RA, DHCPv6 server, AAAA records, and any LAN address derived from the prefix. The router stays "broken in an obvious way" rather than handing out addresses that no longer reach upstream.
4. DS-Lite
In some access networks the AFTR option is not returned in DHCPv6 information-request. DSLiteTunnel therefore treats a static aftrFQDN or aftrIPv6 as a normal configuration path, not a fallback.
AFTR FQDNs frequently cannot be resolved by public DNS. Use a DNSForwarder
for the AFTR domain and a DNSUpstream whose addressFrom reads the
in-network resolver from DHCPv6 information status.
5. Event coordination
routerd has an in-process bus. Controllers consume events and reconcile only the resources affected.
The following kinds work together for higher-level coordination:
EgressRoutePolicyEventRuleDerivedEventHealthCheck
EventRule consumes one stream of events and produces another. DerivedEvent synthesises asserted / retracted virtual events from observed state.
6. Tier S building blocks
WireGuard, Tailscale, IPsec, VRF, and VXLAN are the building blocks for the Tier S (SOHO / branch) capability. WireGuard and VXLAN-over-WireGuard interoperability is verified across the supported operating systems. TailscaleNode covers exit-node and subnet-router advertisement without turning every VPN into one abstract shape.
There is no abstract VPNTunnel resource. WireGuard, Tailscale, IPsec, and future SoftEther integrations are added as their own kinds. The motivation is that each of these has materially different state machines; collapsing them into one polymorphic kind would lose semantics.
7. Open work
- Stateful firewall in production:
FirewallRulenow covers ICMP type matching, multiple ports per rule, nftables rate limiting, and per-source connection limits. Future work should focus on rule grouping and higher-level policy ergonomics rather than basic expression coverage. - DoH proxying for clients on the LAN.
- BFD for BGP peers that need sub-second failure detection.
- Operator-controlled IngressService backend drain mode through
routerctlwithout editing YAML. - More production examples for VRRP
advertInterval,preempt, andpreemptDelaytuning. - Validation for listen-port collisions between
IngressService, local service redirects, and routerd-managed daemons. - IPv6 BGP and VRRPv3 for dual-stack Kubernetes clusters.
- OSPF or other dynamic routing integration for the Tier C tier.
- High availability (leader election, fault-tolerant control plane).
- Production observability: OpenTelemetry collector and remote log sinks.
- Long-running validation of routerd as the only WAN router on a residential link.