On Tue, 9 Sep 2025 16:49:20 +0200
Volker Diels-Grabsch wrote:
When restarting passt while QEMU keeps running with a configured "reconnect-ms" setting, the port forwardings will stop working until the guest sends some outgoing network traffic.
Reason: Although QEMU reconnects successfully to the unix domain socket of the new passt process, that process no longer knows the guest's MAC address and uses the broadcast MAC address instead. However, this is ignored by the guest, at least if the guest runs Linux. Only after the guest sends some network packet on its own initiative will passt know the MAC address and be able to establish forwarded connections.
This change fixes this issue by sending an ARP and an NDP request to resolve the guest's MAC address via its IPv4 and IPv6 address, which we do know, right after the unix domain socket (re)connection.
The only case where the IP is "wrong" would be if the configuration changed, or on the very first start right after QEMU started. But in those cases, we just wouldn't get an ARP/NDP response, and can't do anything until we receive the guest's DHCP request, just as before. In other words, in the worst case the ARP/NDP requests would be harmless.
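For illustration only, the ARP probe described above can be sketched as a standalone 42-byte Ethernet + ARP request frame. This is not the code from the patch: struct sketch_arp_req and sketch_arp_req_fill() are hypothetical names, and the layout is written out by hand instead of using passt's or the kernel's headers. Like the patch, the sketch fills the target hardware address with the broadcast address.

```c
#include <arpa/inet.h>	/* htons() */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN	6

/* Hypothetical on-wire layout: Ethernet header plus ARP request for IPv4 */
struct sketch_arp_req {
	uint8_t  eth_dst[SKETCH_ETH_ALEN];
	uint8_t  eth_src[SKETCH_ETH_ALEN];
	uint16_t eth_proto;		/* 0x0806: ARP */
	uint16_t ar_hrd;		/* 1: Ethernet */
	uint16_t ar_pro;		/* 0x0800: IPv4 */
	uint8_t  ar_hln;		/* hardware address length, 6 */
	uint8_t  ar_pln;		/* protocol address length, 4 */
	uint16_t ar_op;			/* 1: request */
	uint8_t  sha[SKETCH_ETH_ALEN];	/* sender hardware address */
	uint8_t  sip[4];		/* sender IPv4 address */
	uint8_t  tha[SKETCH_ETH_ALEN];	/* target hardware address */
	uint8_t  tip[4];		/* target IPv4 address */
} __attribute__((__packed__));

/* Fill @req with a broadcast request for @guest_ip, return the frame length */
static size_t sketch_arp_req_fill(struct sketch_arp_req *req,
				  const uint8_t our_mac[SKETCH_ETH_ALEN],
				  const uint8_t our_ip[4],
				  const uint8_t guest_ip[4])
{
	/* Ethernet header: we don't know the guest's MAC yet, so broadcast */
	memset(req->eth_dst, 0xff, sizeof(req->eth_dst));
	memcpy(req->eth_src, our_mac, sizeof(req->eth_src));
	req->eth_proto = htons(0x0806);

	/* ARP header */
	req->ar_hrd = htons(1);
	req->ar_pro = htons(0x0800);
	req->ar_hln = SKETCH_ETH_ALEN;
	req->ar_pln = 4;
	req->ar_op = htons(1);

	/* ARP message: ask who has @guest_ip, answer goes to @our_mac */
	memcpy(req->sha, our_mac, sizeof(req->sha));
	memcpy(req->sip, our_ip, sizeof(req->sip));
	memset(req->tha, 0xff, sizeof(req->tha));
	memcpy(req->tip, guest_ip, sizeof(req->tip));

	return sizeof(*req);
}
```

A reply to such a frame carries the guest's MAC address as Ethernet source, which is what the existing code in tap_add_packet() picks up.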
Thanks for the implementation, this looks like a small but quite relevant feature we missed until now. I have a couple of comments on top of David's ones:
Signed-off-by: Volker Diels-Grabsch
---
 arp.c  | 33 +++++++++++++++++++++++++++++++++
 arp.h  |  1 +
 ndp.c  | 19 +++++++++++++++++++
 ndp.h  |  1 +
 tap.c  | 16 ++++++++++++----
 util.h |  1 +
 6 files changed, 67 insertions(+), 4 deletions(-)

diff --git a/arp.c b/arp.c
index 44677ad..c1bd63b 100644
--- a/arp.c
+++ b/arp.c
@@ -112,3 +112,36 @@ int arp(const struct ctx *c, struct iov_tail *data)

 	return 1;
 }
+
+/**
+ * arp_send_init_req() - Send initial ARP request to retrieve guest MAC address
+ * @c:		Execution context
+ */
+void arp_send_init_req(const struct ctx *c)
+{
+	struct {
+		struct ethhdr eh;
+		struct arphdr ah;
+		struct arpmsg am;
+	} __attribute__((__packed__)) req;
+
+	/* Ethernet header */
+	req.eh.h_proto = htons(ETH_P_ARP);
+	memcpy(req.eh.h_dest, MAC_BROADCAST, sizeof(req.eh.h_dest));
+	memcpy(req.eh.h_source, c->our_tap_mac, sizeof(req.eh.h_source));
+
+	/* ARP header */
+	req.ah.ar_op = htons(ARPOP_REQUEST);
+	req.ah.ar_hrd = htons(ARPHRD_ETHER);
+	req.ah.ar_pro = htons(ETH_P_IP);
+	req.ah.ar_hln = ETH_ALEN;
+	req.ah.ar_pln = 4;
+
+	/* ARP message */
+	memcpy(req.am.sha, c->our_tap_mac, sizeof(req.am.sha));
+	memcpy(req.am.sip, &c->ip4.our_tap_addr, sizeof(req.am.sip));
+	memcpy(req.am.tha, MAC_BROADCAST, sizeof(req.am.tha));
+	memcpy(req.am.tip, &c->ip4.addr, sizeof(req.am.tip));
+
+	tap_send_single(c, &req, sizeof(req));
+}
diff --git a/arp.h b/arp.h
index 86bcbf8..d5ad0e1 100644
--- a/arp.h
+++ b/arp.h
@@ -21,5 +21,6 @@ struct arpmsg {
 } __attribute__((__packed__));

 int arp(const struct ctx *c, struct iov_tail *data);
+void arp_send_init_req(const struct ctx *c);

 #endif /* ARP_H */
diff --git a/ndp.c b/ndp.c
index eb090cd..b3bdedb 100644
--- a/ndp.c
+++ b/ndp.c
@@ -438,3 +438,22 @@ void ndp_timer(const struct ctx *c, const struct timespec *now)
 first:
 	next_ra = now->tv_sec + interval;
 }
+
+/**
+ * ndp_send_init_req() - Send initial NDP NS to retrieve guest MAC address
+ * @c:		Execution context
+ */
+void ndp_send_init_req(const struct ctx *c)
+{
+	struct ndp_ns ns = {
+		.ih = {
+			.icmp6_type		= NS,
+			.icmp6_code		= 0,
+			.icmp6_router		= 0, /* Reserved */
+			.icmp6_solicited	= 0, /* Reserved */
+			.icmp6_override		= 0, /* Reserved */
+		},
+		.target_addr = c->ip6.addr
+	};
+	ndp_send(c, &c->ip6.addr, &ns, sizeof(ns));
+}
diff --git a/ndp.h b/ndp.h
index b1dd5e8..781ea86 100644
--- a/ndp.h
+++ b/ndp.h
@@ -11,5 +11,6 @@ struct icmp6hdr;
 int ndp(const struct ctx *c, const struct in6_addr *saddr,
 	struct iov_tail *data);
 void ndp_timer(const struct ctx *c, const struct timespec *now);
+void ndp_send_init_req(const struct ctx *c);

 #endif /* NDP_H */
diff --git a/tap.c b/tap.c
index 7ba6399..ea61eae 100644
--- a/tap.c
+++ b/tap.c
@@ -1088,6 +1088,7 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,
 {
 	struct ethhdr eh_storage;
 	const struct ethhdr *eh;
+	char bufmac[ETH_ADDRSTRLEN];
 	pcap_iov(data->iov, data->cnt, data->off);

@@ -1097,6 +1098,7 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,

 	if (memcmp(c->guest_mac, eh->h_source, ETH_ALEN)) {
 		memcpy(c->guest_mac, eh->h_source, ETH_ALEN);
+		info("Guest MAC address: %s", eth_ntop(c->guest_mac, bufmac, sizeof(bufmac)));
 		proto_update_l2_buf(c->guest_mac, NULL);
 	}

@@ -1355,6 +1357,11 @@ static void tap_start_connection(const struct ctx *c)
 	ev.events = EPOLLIN | EPOLLRDHUP;
 	ev.data.u64 = ref.u64;
 	epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev);
+
+	info("Sending initial ARP and NDP request to retrieve"
+	     " guest MAC address after reconnect");
+	arp_send_init_req(c);
This should be conditional on whether we have IPv4 support enabled or
not, and the check would need to be analogous to the one from
tap4_handler() (sorry, it's a bit hidden):

	if (!c->ifi4 || ...)
		return ...;
+ ndp_send_init_req(c);
And this should only happen if IPv6 is enabled, see tap6_handler():

	if (!c->ifi6 || ...)
		return ...;

and also, arguably, iff NDP support is not disabled by means of --no-ndp
(c->no_ndp). Strictly speaking, we could send this anyway and still fit
the current documentation of --no-ndp:

       --no-ndp
              Disable NDP responses. NDP messages coming from guest or
              target namespace will be ignored.

but this would make --no-ndp a misnomer, and given that we'll ignore
neighbour advertisements, it makes no sense to send a solicitation
anyway. All in all, I would just not do this on c->no_ndp.

If you can think of a terse way of updating the man page to reflect
this, that would be appreciated, but I think it's also fine like it is.

By the way, we'll also ignore responses on --no-icmp. I just realised
that the man page is currently inaccurate, because it refers to echo
messages only, but in tap6_handler() we have:

		if (proto == IPPROTO_ICMPV6) {
			...
			if (c->no_icmp)
				continue;
			...
			if (ndp(c, saddr, &ndp_data))
				continue;
			...
		}

So I think we should update the man page to mention that --no-icmp means
no ICMP and no ICMPv6, and also skip sending the NDP solicitation in
that case. Or update the code to reflect what the man page says, but
then the option could be considered a misnomer, so I wouldn't go this
way.
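To make the two conditions above concrete, here is a minimal sketch of how the gating could look, assuming the flags behave as described in this review. struct sketch_ctx is a hypothetical, reduced stand-in for struct ctx with only the fields mentioned here (field types are assumptions), and both helper names are made up for illustration; the actual patch might well open-code the checks in tap_start_connection() instead.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for passt's struct ctx */
struct sketch_ctx {
	int ifi4;	/* assumed: zero means IPv4 support is disabled */
	int ifi6;	/* assumed: zero means IPv6 support is disabled */
	bool no_ndp;	/* --no-ndp given */
	bool no_icmp;	/* --no-icmp given */
};

/* Send the ARP probe only if IPv4 is enabled, as in tap4_handler() */
static bool sketch_should_send_arp(const struct sketch_ctx *c)
{
	return c->ifi4 != 0;
}

/* Send the neighbour solicitation only if IPv6 is enabled and we would
 * actually process the advertisement coming back: both --no-ndp and
 * --no-icmp cause it to be ignored, so don't solicit in those cases.
 */
static bool sketch_should_send_ns(const struct sketch_ctx *c)
{
	return c->ifi6 != 0 && !c->no_ndp && !c->no_icmp;
}
```

With helpers like these, the two calls in tap_start_connection() would each be wrapped in a plain if (...) around arp_send_init_req(c) and ndp_send_init_req(c).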
}
 /**
@@ -1503,11 +1510,12 @@ void tap_backend_init(struct ctx *c)
 	case MODE_PASST:
 		tap_sock_unix_init(c);

-		/* In passt mode, we don't know the guest's MAC address until it
-		 * sends us packets. Use the broadcast address so that our
-		 * first packets will reach it.
+		/* In passt mode, we don't know the guest's MAC address until
+		 * it sends us packets (e.g. responds to our initial ARP or
I don't think the response is an example, so I wouldn't use "e.g." here, rather "i.e." / "that is", if that's the expected behaviour.
+		 * NDP request). Until then, use the broadcast address so
+		 * that our first packets will have a chance to reach it.
 		 */
-		memset(&c->guest_mac, 0xff, sizeof(c->guest_mac));
+		memcpy(&c->guest_mac, MAC_BROADCAST, sizeof(c->guest_mac));
 		break;
 	}
diff --git a/util.h b/util.h
index 2a8c38f..3719f0c 100644
--- a/util.h
+++ b/util.h
@@ -97,6 +97,7 @@ void abort_with_msg(const char *fmt, ...)
 #define FD_PROTO(x, proto)						\
 	(IN_INTERVAL(c->proto.fd_min, c->proto.fd_max, (x)))
+#define MAC_BROADCAST ((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })
This can be easily wrapped to fit 80 columns without otherwise affecting
readability, see examples just above and below:

#define MAC_BROADCAST							\
	((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })
 #define MAC_ZERO		((uint8_t [ETH_ALEN]){ 0 })
 #define MAC_IS_ZERO(addr)	(!memcmp((addr), MAC_ZERO, ETH_ALEN))
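As a quick standalone check of the wrapped form, the following sketch shows that the compound-literal macro can be passed straight to memcpy() and memcmp(), and that copying it gives the same result as the memset() it replaces in tap_backend_init(). ETH_ALEN is defined locally here only so the snippet builds on its own, and the two sketch_ helpers are hypothetical names, not part of passt.

```c
#include <stdint.h>
#include <string.h>

#ifndef ETH_ALEN
#define ETH_ALEN	6	/* defined locally for this sketch only */
#endif

#define MAC_BROADCAST							\
	((uint8_t [ETH_ALEN]){ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff })

/* Hypothetical helper: reset @mac to the broadcast address */
static void sketch_mac_set_broadcast(uint8_t mac[ETH_ALEN])
{
	memcpy(mac, MAC_BROADCAST, ETH_ALEN);
}

/* Hypothetical helper, analogous to MAC_IS_ZERO(): non-zero if @mac is
 * the broadcast address
 */
static int sketch_mac_is_broadcast(const uint8_t *mac)
{
	return !memcmp(mac, MAC_BROADCAST, ETH_ALEN);
}
```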
The rest looks good to me!

-- 
Stefano