Re: [PATCH v12 3/9] arp/ndp: send ARP announcement / unsolicited NA when neighbour entry added

9 Oct 2025

On Thu, 9 Oct 2025 14:51:02 +1100
David Gibson wrote:
...
On Wed, Oct 08, 2025 at 12:01:18PM +0200, Stefano Brivio wrote:
...
On Wed, 8 Oct 2025 11:27:32 +1100
David Gibson wrote:
...
On Tue, Oct 07, 2025 at 12:10:22PM +0200, Stefano Brivio wrote:
...
On Fri, 3 Oct 2025 14:41:56 +1000
David Gibson wrote:
...
On Thu, Oct 02, 2025 at 08:34:06PM -0400, Jon Maloy wrote:
...
ARP announcements and unsolicited NAs should be handled with caution
because of the risk of malicious users emitting them to disturb
network communication.
There is however one case where we know it is legitimate
and safe for us to send out such messages: the one time we switch
from using ctx->own_tap_mac to a MAC address received via the
recently added neighbour subscription function. Later changes to
the MAC address of a host in an existing entry cannot be fully
trusted, so we abstain from doing it in such cases.
When sending this type of message, we notice that the guest accepts
the update, but shortly afterwards asks for a confirmation in the
form of a regular ARP/NS request. This is responded to with the new
value, and we have exactly the effect we wanted.
This commit adds this functionality.
Signed-off-by: Jon Maloy 
---
v10: -Made small changes based on feedback from David G.
v11: -Moved from 'Gratuitous ARP reply' model to 'ARP Announcement'
      model.
v12: -Excluding loopback and default GW addresses from the ARP/NA
      announcement to be sent to the guest
---
 arp.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 arp.h |  2 ++
 fwd.c | 16 ++++++++++++++++
 ndp.c | 10 ++++++++++
 ndp.h |  1 +
 5 files changed, 71 insertions(+)

diff --git a/arp.c b/arp.c
index ad088b1..b08780f 100644
--- a/arp.c
+++ b/arp.c
@@ -146,3 +146,45 @@ void arp_send_init_req(const struct ctx *c)
  debug("Sending initial ARP request for guest MAC address");
  tap_send_single(c, &req, sizeof(req));
 }
+
+/**
+ * arp_announce() - Send an ARP announcement for an IPv4 host
+ * @c:		Execution context
+ * @ip:	IPv4 address we announce as owned by @mac
+ * @mac:	MAC address to advertise for @ip
+ */
+void arp_announce(const struct ctx *c, struct in_addr *ip,
+		  const unsigned char *mac)
+{
+	char ip_str[INET_ADDRSTRLEN];
+	char mac_str[ETH_ADDRSTRLEN];
+	struct {
+		struct ethhdr eh;
+		struct arphdr ah;
+		struct arpmsg am;
+	} __attribute__((__packed__)) annc;
+
+	/* Ethernet header */
+	annc.eh.h_proto = htons(ETH_P_ARP);
+	memcpy(annc.eh.h_dest, MAC_BROADCAST, sizeof(annc.eh.h_dest));
+	memcpy(annc.eh.h_source, mac, sizeof(annc.eh.h_source));
+
+	/* ARP header */
+	annc.ah.ar_op = htons(ARPOP_REQUEST);
+	annc.ah.ar_hrd = htons(ARPHRD_ETHER);
+	annc.ah.ar_pro = htons(ETH_P_IP);
+	annc.ah.ar_hln = ETH_ALEN;
+	annc.ah.ar_pln = 4;
+
+	/* ARP message */
+	memcpy(annc.am.sha, mac, sizeof(annc.am.sha));
+	memcpy(annc.am.sip, ip, sizeof(annc.am.sip));
+	memcpy(annc.am.tha, MAC_BROADCAST, sizeof(annc.am.tha));
+	memcpy(annc.am.tip, ip, sizeof(annc.am.tip));
As noted in several earlier revisions, having sip == tip (but with
different MAC addresses) looks odd.  Is that what the RFCs say to do
for ARP announcements?
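
For reference, here is a minimal standalone sketch of an ARP
Announcement as RFC 5227 describes it: an ARP Request with sender and
target IP both set to the announced address. The RFC says the target
hardware address is ignored and should be all zeroes in an ARP Probe,
and it describes the Announcement as identical apart from the sender
IP, so the sketch leaves tha zeroed rather than set to broadcast. The
names annc_frame and build_arp_announcement() are hypothetical and not
part of this patch or of passt.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat layout: Ethernet header + ARP header + ARP payload */
struct annc_frame {
	uint8_t  eth_dst[6];
	uint8_t  eth_src[6];
	uint16_t eth_proto;
	uint16_t ar_hrd;
	uint16_t ar_pro;
	uint8_t  ar_hln;
	uint8_t  ar_pln;
	uint16_t ar_op;
	uint8_t  sha[6];
	uint8_t  sip[4];
	uint8_t  tha[6];
	uint8_t  tip[4];
} __attribute__((__packed__));

/* Fill @f with an ARP Announcement claiming @ip for @mac */
static void build_arp_announcement(struct annc_frame *f,
				   const uint8_t mac[6],
				   const struct in_addr *ip)
{
	assert(sizeof(*f) == 42);	/* 14 (Ethernet) + 28 (ARP) bytes */

	memset(f, 0, sizeof(*f));

	/* Ethernet header: link-layer broadcast, sent from @mac */
	memset(f->eth_dst, 0xff, sizeof(f->eth_dst));
	memcpy(f->eth_src, mac, sizeof(f->eth_src));
	f->eth_proto = htons(0x0806);	/* ETH_P_ARP */

	/* ARP header: Ethernet/IPv4, opcode is Request, not Reply */
	f->ar_hrd = htons(1);		/* ARPHRD_ETHER */
	f->ar_pro = htons(0x0800);	/* ETH_P_IP */
	f->ar_hln = 6;
	f->ar_pln = 4;
	f->ar_op  = htons(1);		/* ARPOP_REQUEST */

	/* ARP payload: sip == tip == announced address, tha stays zero */
	memcpy(f->sha, mac, sizeof(f->sha));
	memcpy(f->sip, ip, sizeof(f->sip));
	memcpy(f->tip, ip, sizeof(f->tip));
}
```

So sip == tip is expected for an announcement; the only difference
from the hunk above would be the value placed in tha.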
...
+	inet_ntop(AF_INET, ip, ip_str, sizeof(ip_str));
+	eth_ntop(mac, mac_str, sizeof(mac_str));
+	debug("Announcing ARP for %s / %s", ip_str, mac_str);
+
+	tap_send_single(c, &annc, sizeof(annc));
+}
diff --git a/arp.h b/arp.h
index d5ad0e1..4862e90 100644
--- a/arp.h
+++ b/arp.h
@@ -22,5 +22,7 @@ struct arpmsg {
int arp(const struct ctx *c, struct iov_tail *data);
 void arp_send_init_req(const struct ctx *c);
+void arp_announce(const struct ctx *c, struct in_addr *ip,
+		  const unsigned char *mac);
#endif /* ARP_H */
diff --git a/fwd.c b/fwd.c
index c34bb1c..ade97c8 100644
--- a/fwd.c
+++ b/fwd.c
@@ -26,6 +26,8 @@
 #include "passt.h"
 #include "lineread.h"
 #include "flow_table.h"
+#include "arp.h"
+#include "ndp.h"
 /* Ephemeral port range: values from RFC 6335 */
 static in_port_t fwd_ephemeral_min = (1 << 15) + (1 << 14);
@@ -140,6 +142,20 @@ void fwd_neigh_table_update(const struct ctx *c, const union inany_addr *addr,
memcpy(&e->addr, addr, sizeof(*addr));
  memcpy(e->mac, mac, ETH_ALEN);
+
+	if (inany_equals(addr, &inany_loopback4))
+		return;
+	if (inany_equals(addr, &inany_loopback6))
+		return;
Since you need these explicit checks anyway, there's not much point to
the dummy entries you created - you could exit on these addresses
before even looking up the table.
I guess those entries make sense if we can drop all these checks as a
result. I think we should be able to.
We couldn't in this version, because that might have allowed the
entries for loopback to be updated, which is certainly wrong.  But
it will all need re-examination after moving everything over to guest
side addresses which AIUI is the plan for the next spin.
Yes, I was talking about the next version. For context, when we first
discussed the possibility of these entries with Jon, my
assumption was that the whole series used guest-side link-layer
addresses exclusively,
We did use guest-side link-layer addresses - host-side LL addresses
might not even exist.  The question is about whether we use guest side
or host side IP addresses to index the table.
Sorry, yes, I meant to write network and I wrote link-layer.
...
...
but that wasn't the case, hence (I think) the
current struggle. If we go in that direction, I hope it's possible.
Thinking a bit more closely, I don't think it is, for much the same
reason it wasn't in this draft.
According to the rules Jon and I thrashed out elsewhere in the thread,
there are certain guest side addresses that must be locked to use
our_tap_mac.  We're essentially shadowing something that might exist
on the host side, so we should use our MAC not the MAC of whatever is
shadowed.
Just pre-populating an entry won't do the trick, because it could be
overwritten if the right events occur for the shadowed host.
Right, sorry, I omitted another bit of context: I've been suggesting to
Jon that he introduce some kind of "permanent" or "administrative"
bit, and keep those entries at the beginning of the chain, exactly for
the reason you mention.

I can imagine we'll need those at some point if we ever want to offer
explicit link-layer address mapping in the future, and they're probably
convenient the day one can change map_guest_addr and map_host_loopback
at runtime.

We can also happily skip that for the moment, though: it's another
problem we can keep for later.
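
To illustrate the "permanent" bit idea, here is a rough standalone
sketch, under several assumptions: a single chain keyed by a plain
IPv4 address instead of union inany_addr, heap allocation instead of
whatever storage the series ends up using, and hypothetical names
(struct neigh_entry, neigh_update()). Permanent entries stay at the
head of the chain and refuse updates coming from dynamic events, which
is the property needed for addresses locked to our_tap_mac.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical neighbour entry, simplified to IPv4 only */
struct neigh_entry {
	struct neigh_entry *next;
	uint32_t addr;		/* guest-side address (assumption) */
	uint8_t mac[6];
	bool permanent;		/* set: only permanent updates may change it */
};

/*
 * Insert or update the entry for @addr. Returns false if the update was
 * rejected (dynamic update of a permanent entry) or allocation failed.
 * An existing dynamic entry is not promoted to permanent here; a real
 * implementation would need to handle that case.
 */
static bool neigh_update(struct neigh_entry **head, uint32_t addr,
			 const uint8_t *mac, bool permanent)
{
	struct neigh_entry **pp, *e;

	for (e = *head; e; e = e->next) {
		if (e->addr != addr)
			continue;
		if (e->permanent && !permanent)
			return false;	/* shadowed host must not win */
		memcpy(e->mac, mac, sizeof(e->mac));
		return true;
	}

	e = calloc(1, sizeof(*e));
	if (!e)
		return false;
	e->addr = addr;
	e->permanent = permanent;
	memcpy(e->mac, mac, sizeof(e->mac));

	/* Permanent entries go first, dynamic ones after the last of them */
	pp = head;
	if (!permanent)
		while (*pp && (*pp)->permanent)
			pp = &(*pp)->next;
	e->next = *pp;
	*pp = e;
	return true;
}
```

With something like this, a netlink event for the shadowed host would
simply be dropped by neigh_update() instead of needing an explicit
address check at each call site.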
...
...
By the way, while they are probably more elegant because we can skip
explicit cases, they might be a bit more complicated to manage compared
to those explicit cases the day we get to change addresses and routes
dynamically using a netlink monitor, because at that point we might
need to remove some entries based on old addresses / default gateways.
But given that this is already complicated enough, we can keep that
problem for later, and just go with the simplest possible approach
(whatever it is) for the moment.
-- 
Stefano