On 2025-09-25 02:36, David Gibson wrote:
On Wed, Sep 24, 2025 at 06:18:52PM -0400, Jon Maloy wrote:
[...]
I experimented a bit with this. My test program is a simple UDP client-server pair, exchanging first 3 UDP messages client->server, followed by 3 messages server->client.
With the client on the guest, and server outside? How is the outside machine arranged - is it a physically separate host? A bridged VM or container on the same host? Something else?
It is a physically separate host.
First, I changed the main() loop a bit, so that netlink events, if any, are handled before all other events. Basically, I added an extra loop before the main loop that handles only netlink events, before moving on to the main loop (where netlink events are now excluded). This should guarantee netlink events absolute priority over any other events. As you will see below, this made no difference to the scenarios I describe.
Drat.
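For reference, the netlink-first handling described above could look roughly like the sketch below. It is only an illustration, not the actual change: dispatch_netlink_first(), ev_handler_t, struct order_log and record_fd() are hypothetical names, and it assumes each epoll event carries its file descriptor in data.fd, which may not match how the real event loop identifies its sockets.

```c
#include <stddef.h>
#include <sys/epoll.h>

/* Hypothetical handler type: called once per ready event */
typedef void (*ev_handler_t)(const struct epoll_event *ev, void *ctx);

/*
 * Dispatch one batch of epoll events in two passes, so that events on
 * nl_fd (assumed to be the netlink socket) are handled before all others.
 */
static void dispatch_netlink_first(const struct epoll_event *events, int n,
				   int nl_fd, ev_handler_t handle, void *ctx)
{
	int i;

	/* Pass 1: netlink events only */
	for (i = 0; i < n; i++)
		if (events[i].data.fd == nl_fd)
			handle(&events[i], ctx);

	/* Pass 2: everything else, netlink excluded */
	for (i = 0; i < n; i++)
		if (events[i].data.fd != nl_fd)
			handle(&events[i], ctx);
}

/* Demo handler: records the order in which descriptors were handled */
struct order_log {
	int fds[16];
	int count;
};

static void record_fd(const struct epoll_event *ev, void *ctx)
{
	struct order_log *log = ctx;

	if (log->count < 16)
		log->fds[log->count++] = ev->data.fd;
}
```

Note that this only reorders events within one batch returned by epoll_wait(). It cannot make a neighbour event arrive any earlier, which is consistent with the reordering making no difference in the scenarios described.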
1: When starting the container, I notice that there is no subscription event in PASTA, even though I can see the entry for the remote host is present in the host's ARP table. There is never any event coming up even if I wait for 10+ minutes.
Huh.... do we need to do something to ensure we get events for existing entries in the host ARP table, not just ones that are added or updated after we're running?
It doesn't seem to be possible, but even if it were it wouldn't help us much if the entry isn't here, which is also a problematic case. See below.
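As a side note on entries that already exist: rtnetlink does allow an explicit dump of the neighbour table (RTM_GETNEIGH with NLM_F_DUMP), which is a query rather than an event. Whether that fits here is an assumption, and it does nothing for the case where the entry is missing. A minimal sketch of building such a request follows; struct neigh_dump_req and neigh_dump_req_init() are hypothetical names used only for illustration.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/neighbour.h>

/* Hypothetical request layout: netlink header followed by struct ndmsg */
struct neigh_dump_req {
	struct nlmsghdr nlh;
	struct ndmsg ndm;
};

/*
 * Fill in a request asking the kernel to dump all neighbour entries for
 * the given address family (AF_INET for ARP, AF_INET6 for NDP).
 * The caller would send it on an AF_NETLINK / NETLINK_ROUTE socket and
 * read RTM_NEWNEIGH replies until NLMSG_DONE.
 */
static void neigh_dump_req_init(struct neigh_dump_req *req, uint32_t seq,
				unsigned char family)
{
	memset(req, 0, sizeof(*req));
	req->nlh.nlmsg_len = NLMSG_LENGTH(sizeof(req->ndm));
	req->nlh.nlmsg_type = RTM_GETNEIGH;
	req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
	req->nlh.nlmsg_seq = seq;
	req->ndm.ndm_family = family;
}
```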
2: The guest attempts to send the first UDP message. An ARP request is sent to PASTA, and responded to with the 9a:9a: address.
Maybe we still need to explicitly ask for an ARP resolution when the guest ARPs.
I think so. If we limit this to ARP and NDP, this should be unproblematic.
3: The UDP, and two more UDPs, are sent via PASTA to the remote host. Those are responded to and sent back to the guest.
4: I now receive a neighbour event, and can update my cache, but since there is still no new ARP request from the guest, even if I wait for many minutes, it continues to believe the old address is confirmed.
5: If I run the same test again after a few minutes, the guest *does* send out an ARP request a few seconds after the message exchange, and is now updated with the correct address.
- If I run this sequence in the opposite direction everything seems to work ok, at least if the ARP entry is already present on the local host.
- When I delete that ARP entry before running the sequence,
Delete it from the host ARP table, you mean?
Yes.
a neighbour event shows up after some seconds, but it can take up to a minute, at least.
Oof. I guess some delay is inevitable, but that's way longer than I would have expected.
If I run my sequence from the remote host before that happens, there will be an ARP request from the guest (for the response UDPs), responded to with the default tap MAC, and it will remain like that for a long time, since the guest considers the MAC address confirmed. It doesn't help much that a neighbour event shows up some seconds after the exchange.
In brief, the guest *will* be updated eventually, but depending on luck and timing it may take a long time, at least several minutes.
[...]
+ memcpy(req.am.tha, MAC_BROADCAST, sizeof(req.am.tha));
+ memcpy(req.am.tip, &ip, sizeof(req.am.tip));
So, I was trying to check if it made sense to use the same IP for both source and target here, and came across https://www.rfc-editor.org/rfc/rfc5227#section-3
Which suggests we should (counterintuitively) be using ARP requests, not ARP replies, for announcements.
Instead of gratuitous ARP, you mean? I can try it.
It suggests that what's traditionally meant by "gratuitous ARP" is actually ARP requests, not responses as you might expect. There's some detailed reasoning there, I'd give it a read.
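To make that concrete, here is a rough sketch of an RFC 5227 style ARP Announcement: an ARP request whose sender and target IP fields both carry the announced address. As far as I read the RFC, the target hardware address is ignored and should be all zeroes (the broadcast address belongs in the Ethernet header, which is not shown here), whereas the quoted hunk uses MAC_BROADCAST in that field. struct arp_msg, arp_announce_init() and the constants are hypothetical stand-ins for illustration, not the definitions used by the patch.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

#define ARP_HTYPE_ETHER	1	/* hardware type: Ethernet */
#define ARP_PTYPE_IPV4	0x0800	/* protocol type: IPv4 */
#define ARP_OP_REQUEST	1	/* opcode: request */

/* Hypothetical wire layout of an IPv4-over-Ethernet ARP message */
struct arp_msg {
	uint16_t htype;
	uint16_t ptype;
	uint8_t hlen;
	uint8_t plen;
	uint16_t op;
	uint8_t sha[6];		/* sender hardware address */
	uint8_t sip[4];		/* sender IP address */
	uint8_t tha[6];		/* target hardware address */
	uint8_t tip[4];		/* target IP address */
} __attribute__((packed));

/*
 * Build an ARP Announcement for address ip, owned by mac: a request
 * with sender IP == target IP and a zeroed target hardware address.
 */
static void arp_announce_init(struct arp_msg *am, const uint8_t mac[6],
			      const uint8_t ip[4])
{
	memset(am, 0, sizeof(*am));	/* also leaves tha all zeroes */
	am->htype = htons(ARP_HTYPE_ETHER);
	am->ptype = htons(ARP_PTYPE_IPV4);
	am->hlen = 6;
	am->plen = 4;
	am->op = htons(ARP_OP_REQUEST);
	memcpy(am->sha, mac, sizeof(am->sha));
	memcpy(am->sip, ip, sizeof(am->sip));
	memcpy(am->tip, ip, sizeof(am->tip));
}
```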
So will I. It sounds interesting. ///jon