On Wed, Oct 08, 2025 at 12:42:37AM +0200, Stefano Brivio wrote:
On Tue, 7 Oct 2025 10:31:11 +1100 David Gibson
wrote: On Tue, Oct 07, 2025 at 12:32:19AM +0200, Stefano Brivio wrote:
On Fri, 3 Oct 2025 13:19:17 +1000 David Gibson
wrote: On Thu, Oct 02, 2025 at 01:58:41PM +0200, Stefano Brivio wrote:
On Thu, 2 Oct 2025 12:41:08 +1000 David Gibson
wrote: On Thu, Oct 02, 2025 at 02:06:43AM +0200, Stefano Brivio wrote: > If we reach end-of-file on a socket (or get EPOLLRDHUP / EPOLLHUP) and > send a FIN segment to the guest / container acknowledging a sequence > number that's behind what we received so far, we won't have any > further trigger to send an updated ACK segment, as we are now > switching the epoll socket monitoring to edge-triggered mode. > > To avoid this situation, in tcp_update_seqack_wnd(), we set the next > acknowledgement sequence to the current observed sequence, regardless > of what was acknowledged socket-side.
To double check my understanding: things should work if we always acknowledged everything we've received. Acknowledging only what the peer has acked is a refinement to give the guest a view that's closer to what it would be end-to-end with the peer (which might improve the operation of flow control).
Right.
We can't use that refined mechanism when the socket is closing (amongst other cases), because while we can get the peer acked bytes from TCP_INFO, we can't get events when that changes, so we have no mechanism to provide updates to the guest at the right time. So we fall back to the simpler method.
Is that correct?
Also correct, yes. If you have a better idea to summarise this in the comment in tcp_buf_data_from_sock() let me know.
Hm, I might. Or actually a way to reorganise the code that I think will be a bit clearer and probably allow a clearer comment too.
I would keep reworks for a later moment. Right now, it's already taking me long enough to find a moment to investigate these issues, write these fixes, and test them.
I mean... the change I'm proposing reduces lines of code (excepting the big new comment), makes it easier to reason about and is localised to the immediately surrounding code. But whatever, I don't particularly care about the order we do things.
Sure, I don't think that other patch is particularly complicated or might ever become problematic at all, but, from your earlier comment ("reorganise the code", as I mentioned the comment in tcp_buf_data_from_sock()), I thought you wanted to rework this particular code in tcp_buf_data_from_sock() at the same time.
Yeah, I guess "reorganise" gives an impression of a wider ranging change than it actually was.
I guess it's not the case judging from your most recent reply, though.
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson