Starting from commit 9178a9e3462d ("tcp: Always send an ACK segment once the handshake is completed"), we always send an ACK segment, without any payload, to complete the three-way handshake while establishing a connection started from a socket. We queue that segment after checking if we already have data to send to the tap, which means that its sequence number is higher than any segment with data we're sending in the same iteration, if any data is available on the socket. However, in tcp_defer_handler(), we first flush "flags" buffers, that is, we send out segments without any data first, and then segments with data, which means that our "empty" ACK is sent before the ACK segment with data (if any), which has a lower sequence number. This appears to be harmless as the guest or container will generally reorder segments, but it looks rather weird and we can't exclude it's actually causing problems. Queue the empty ACK first, so that it gets a lower sequence number, before checking for any data from the socket. Reported-by: David Gibson <david(a)gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- tcp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tcp.c b/tcp.c index 9617b7a..b2155ab 100644 --- a/tcp.c +++ b/tcp.c @@ -1957,11 +1957,12 @@ static void tcp_conn_from_sock_finish(const struct ctx *c, return; } + tcp_send_flag(c, conn, ACK); + /* The client might have sent data already, which we didn't * dequeue waiting for SYN,ACK from tap -- check now. */ tcp_data_from_sock(c, conn); - tcp_send_flag(c, conn, ACK); } /** -- 2.43.0
On Tue, Oct 15, 2024 at 12:38:20AM +0200, Stefano Brivio wrote:Starting from commit 9178a9e3462d ("tcp: Always send an ACK segment once the handshake is completed"), we always send an ACK segment, without any payload, to complete the three-way handshake while establishing a connection started from a socket. We queue that segment after checking if we already have data to send to the tap, which means that its sequence number is higher than any segment with data we're sending in the same iteration, if any data is available on the socket. However, in tcp_defer_handler(), we first flush "flags" buffers, that is, we send out segments without any data first, and then segments with data, which means that our "empty" ACK is sent before the ACK segment with data (if any), which has a lower sequence number. This appears to be harmless as the guest or container will generally reorder segments, but it looks rather weird and we can't exclude it's actually causing problems. Queue the empty ACK first, so that it gets a lower sequence number, before checking for any data from the socket. Reported-by: David Gibson <david(a)gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>--- tcp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tcp.c b/tcp.c index 9617b7a..b2155ab 100644 --- a/tcp.c +++ b/tcp.c @@ -1957,11 +1957,12 @@ static void tcp_conn_from_sock_finish(const struct ctx *c, return; } + tcp_send_flag(c, conn, ACK); + /* The client might have sent data already, which we didn't * dequeue waiting for SYN,ACK from tap -- check now. */ tcp_data_from_sock(c, conn); - tcp_send_flag(c, conn, ACK); } /**-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson