On Mon, 3 Feb 2025 11:46:13 +1100
David Gibson <david(a)gibson.dropbear.id.au> wrote:

On Fri, Jan 31, 2025 at 10:09:19AM +0100, Stefano Brivio wrote:

While the explanation for the issue is what you gave as comment to 8/20 (I need to close() the socket from passt-repair), let me answer here: sure, I must close() it, and it was close()d by passt but not passt-repair.

Fixed, finally. Some answers:

On Fri, 31 Jan 2025 17:14:18 +1100
David Gibson <david(a)gibson.dropbear.id.au> wrote:

Ok.

On Fri, Jan 31, 2025 at 06:36:55AM +0100, Stefano Brivio wrote:

It is.

On Thu, 30 Jan 2025 09:32:36 +0100
Stefano Brivio <sbrivio(a)redhat.com> wrote:

> I would like to quickly complete the whole flow first, because I think
> we can inform design and implementation decisions much better at that
> point

So, there seems to be a problem with (testing?) this. I couldn't quite understand the root cause yet, and it doesn't happen with the reference source.c and target.c implementations I shared.

Let's assume I have a connection in the source guest to 127.0.0.1:9091, from 127.0.0.1:56350.
After the migration, in the target, I get:

---
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 79
setsockopt(79, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(79, {sa_family=AF_INET, sin_port=htons(56350), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
sendmsg(72, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\1", iov_len=1}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[79]}], msg_controllen=24, msg_flags=0}, 0) = 1
recvfrom(72, "\1", 1, 0, NULL, NULL) = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [2], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [1788468535], 4) = 0
write(2, "77.6923: ", 977.6923: ) = 9
write(2, "Set send queue sequence for sock"..., 51Set send queue sequence for socket 79 to 1788468535) = 51
write(2, "\n", 1 ) = 1
setsockopt(79, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0
setsockopt(79, SOL_TCP, TCP_QUEUE_SEQ, [115288604], 4) = 0
write(2, "77.6924: ", 977.6924: ) = 9
write(2, "Set receive queue sequence for s"..., 53Set receive queue sequence for socket 79 to 115288604) = 53
write(2, "\n", 1 ) = 1
connect(79, {sa_family=AF_INET, sin_port=htons(9091), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
---

EADDRNOTAVAIL, according to the documentation, which seems to be consistent with a glance at the implementation (that is, I must be missing some issue in the kernel), should be returned on connect() if:

       EADDRNOTAVAIL
              (Internet domain sockets) The socket referred to by sockfd had
              not previously been bound to an address and, upon attempting to
              bind it to an ephemeral port, it was determined that all port
              numbers in the ephemeral port range are currently in use. See
              the discussion of /proc/sys/net/ipv4/ip_local_port_range in
              ip(7).

but well, of course it was bound.
To a port, indeed, not a full address, that is, any address (0.0.0.0) and port, but I think for the purposes of this description that bind() call is enough.

So, I was wondering if binding to 0.0.0.0 is sufficient for a repaired socket.

Usually, of course, that 0.0.0.0 would be resolved to a real address at connect() time. But TCP_REPAIR's version of connect() bypasses a bunch of the usual connect logic, so maybe we need an explicit address here.

No need.

I'm still confused by the specific sequence of events that's causing the problem. If a socket is closed with close(2) it should no longer exist, so I don't see how you could even attempt to do anything with it. Do you mean that the socket is shutdown(RD|WR)? Or that it's been closed by passt, but not by passt-repair? Or the other way around? I'd kind of assume that you _must_ close the socket while still in repair mode, since we want it to go away on the source without attempting to FIN or RST or anything.

...but that doesn't explain the difference between passt and your test implementation.

The difference that actually matters is that the test implementation terminates, and that has the equivalent effect of switching off repair mode for the closed sockets, which frees up all the associated context, including the port.

Usually, there are no valid operations on closed sockets (not even close()). This is the first exception I ever met: you can set TCP_REPAIR_OFF.

Nah, most likely not. The EBADF on a close()d socket is a bit questionable (it should be EINVAL? Or a -1 socket in the recipient?), but other than that, the explanation is that passing that closed socket caused EOF in passt-repair, and passt-repair would quit, solving the issue.

-- 
Stefano

But there's a catch: you can't pass a closed socket in repair mode via SCM_RIGHTS (well, I'm fairly sure nobody approached this level of insanity before): you get EBADF (which is an understatement).
And there's another catch: if you actually try to do that, even if it fails, that has the same effect of clearing the socket entirely: you free up the port.

!?! this is even more baffling. Passing what's now an unrelated, unassigned integer as an fd is having some effect on a socket that was around!? If so that's a horrifying kernel bug.
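
For illustration, the descriptor-passing step from the trace above (sendmsg() with a one-byte payload and an SCM_RIGHTS control message) can be sketched roughly as below. This is not code from passt or passt-repair: send_fd() and send_fd_demo() are hypothetical names, a socketpair() stands in for the UNIX domain socket to the helper, and the TCP socket is never put into repair mode (that would need CAP_NET_ADMIN). So it only shows the generic EBADF on a close()d descriptor, not any effect on the port.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send descriptor fd over the UNIX domain socket
 * unix_sock as SCM_RIGHTS ancillary data, with a one-byte payload.
 * Returns the number of payload bytes sent, or -errno on failure.
 */
static int send_fd(int unix_sock, int fd)
{
	char cmd = 1;
	struct iovec iov = { .iov_base = &cmd, .iov_len = sizeof(cmd) };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* Ensures proper alignment of buf */
	} u;
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;
	ssize_t n;

	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	assert(cmsg != NULL);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	n = sendmsg(unix_sock, &msg, 0);
	return n < 0 ? -errno : (int)n;
}

/* Hypothetical demo: create a TCP socket, optionally close() it first,
 * then try to pass its descriptor number. Returns what send_fd() returns.
 */
int send_fd_demo(int close_first)
{
	int pair[2], s, ret;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair))
		return -errno;

	s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (s < 0) {
		ret = -errno;
		goto out;
	}

	if (close_first)
		close(s);	/* s is now just a stale integer */

	ret = send_fd(pair[0], s);

	if (!close_first)
		close(s);
out:
	close(pair[0]);
	close(pair[1]);
	return ret;
}
```

On Linux, send_fd_demo(0) should return 1 (the payload byte), and send_fd_demo(1) should return -EBADF, because the kernel looks up the descriptor number at sendmsg() time and finds nothing behind it.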