On Wed, May 20, 2026 at 10:28:36PM +0200, Stefano Brivio wrote:
On Wed, 20 May 2026 23:08:47 +1000 David Gibson wrote:
tcp_splice_sock_handler() has an optimised path for the common case where the amount we splice(2) into the pipe is exactly the same as the amount we splice(2) out again. If the pipe is empty at that point, we stop forwarding until we get another epoll event.
However, via a subtle chain of events, this can cause a bug for a half-closed connection. Suppose the connection is already half-closed in the other direction - that is, we've already called shutdown(SHUT_WR) on the socket for which we're getting the event. In this event we're getting the last batch of data in the other direction, and also a FIN. This can result in EPOLLIN, EPOLLRDHUP and EPOLLHUP events simultaneously.
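The event combination described above can be observed outside passt. The following is a minimal sketch, assuming Linux and using an AF_UNIX stream socketpair as a stand-in for the TCP socket (the helper names pending_events() and half_closed_last_data_events() are hypothetical, not passt functions). It half-closes one end, lets the peer send a last chunk followed by its own end-of-stream, and returns what epoll reports:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Return the events epoll currently reports for fd, 0 if none or on error */
static uint32_t pending_events(int fd)
{
	struct epoll_event ev = { .events = EPOLLIN | EPOLLRDHUP };
	struct epoll_event got = { 0 };
	uint32_t mask = 0;
	int epfd;

	assert(fd >= 0);

	epfd = epoll_create1(0);
	if (epfd < 0)
		return 0;

	/* EPOLLHUP is always reported, it doesn't need to be requested */
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0 &&
	    epoll_wait(epfd, &got, 1, 0) == 1)
		mask = got.events;

	close(epfd);
	return mask;
}

/* Hypothetical scenario helper: sv[0] plays "the socket for which we're
 * getting the event", sv[1] plays its peer. Returns 0 on setup failure. */
static uint32_t half_closed_last_data_events(void)
{
	uint32_t mask = 0;
	int sv[2];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return 0;

	/* We already sent our end-of-stream in the other direction */
	if (shutdown(sv[0], SHUT_WR) == 0 &&
	    /* The peer sends its last data... */
	    send(sv[1], "last", 4, 0) == 4 &&
	    /* ...immediately followed by its own end-of-stream */
	    shutdown(sv[1], SHUT_WR) == 0)
		mask = pending_events(sv[0]);

	close(sv[0]);
	close(sv[1]);
	return mask;
}
```

On Linux, the returned mask should contain EPOLLIN, EPOLLRDHUP and EPOLLHUP at once, which is the situation the handler has to cope with in a single invocation.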
We read the last data from the socket and successfully splice it to the other side. Since there is no data in the pipe, we exit the forwarding loop. However, because we did read data, we don't set the eof flag.
Because we don't set eof, we don't (yet) propagate the FIN to the other side, or set FIN_SENT_(!fromsidei). Therefore we don't (yet) recognize this as a clean termination and set the CLOSING flag. We would correct this when we get our next event. However, before we can do so, we process the EPOLLHUP event. Because we haven't recognized this as a clean close, we assume it is an abrupt close and send an RST to the other side.
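The chain of decisions in this paragraph can be written down as a small pure function. This is only a model of the reasoning, not the code in tcp_splice.c: the outcome names, the function name and its parameters are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome names, for illustration only */
enum outcome { KEEP_OPEN, CLEAN_CLOSE, SEND_RST };

/* eof_seen:       the forwarding pass for this event observed EOF
 * fin_sent_other: shutdown(SHUT_WR) was already called on this socket,
 *                 that is, the other direction is already finished
 * hup:            EPOLLHUP was reported as part of the same event
 */
static enum outcome classify_event(bool eof_seen, bool fin_sent_other,
				   bool hup)
{
	/* The FIN for this direction is propagated only once EOF is seen */
	bool fin_sent_this = eof_seen;

	/* Both directions finished: clean termination (the CLOSING case) */
	if (fin_sent_this && fin_sent_other)
		return CLEAN_CLOSE;

	/* A hangup without a recognised clean close is treated as abrupt */
	if (hup)
		return SEND_RST;

	return KEEP_OPEN;
}

/* Self-check of the two cases discussed in the thread */
static void check_model(void)
{
	/* Data was read, EOF not yet seen, EPOLLHUP pending: spurious RST */
	assert(classify_event(false, true, true) == SEND_RST);
	/* EOF seen in the same pass: recognised as a clean close */
	assert(classify_event(true, true, true) == CLEAN_CLOSE);
}
```

In this model the only difference between the two cases is whether EOF was observed before the EPOLLHUP check.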
To avoid this, don't stop attempting to forward data on this path. Continue for at least one more loop. If we're at EOF, we'll recognize it on the next splice(2). If not, it gives us an opportunity to forward more data without returning to the main epoll loop.
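The effect of going around the loop once more can be illustrated with a simplified model. This sketch makes several assumptions: recv(2) and send(2) on AF_UNIX stream socketpairs stand in for the two splice(2) calls and the pipe, partial writes are not modelled, and the names forward_until_idle() and eof_seen_with_last_data() are hypothetical. The stop_when_drained flag selects between the old "break" and the new "continue" behaviour:

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Forward data from 'from' to 'to' until nothing more can be done.
 * Returns true if EOF on 'from' was observed before the loop exited. */
static bool forward_until_idle(int from, int to, bool stop_when_drained)
{
	char buf[4096];

	assert(from >= 0 && to >= 0);

	for (;;) {
		ssize_t readlen = recv(from, buf, sizeof(buf), MSG_DONTWAIT);
		ssize_t written;

		if (readlen == 0)
			return true;	/* EOF: the FIN can be propagated */
		if (readlen < 0)
			return false;	/* EAGAIN or error: wait for an event */

		written = send(to, buf, (size_t)readlen,
			       MSG_DONTWAIT | MSG_NOSIGNAL);
		if (written != readlen)
			return false;	/* pending data is not modelled here */

		/* Everything read was written out again, nothing is pending */
		if (stop_when_drained)
			return false;	/* old behaviour: "break" */
		/* new behaviour: "continue", the next recv() may see EOF */
	}
}

/* Scenario: the peer sends its final data and end-of-stream together */
static bool eof_seen_with_last_data(bool stop_when_drained)
{
	int in[2], out[2];
	bool eof = false;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, in) < 0)
		return false;
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, out) < 0) {
		close(in[0]);
		close(in[1]);
		return false;
	}

	if (send(in[1], "last", 4, 0) == 4 && shutdown(in[1], SHUT_WR) == 0)
		eof = forward_until_idle(in[0], out[0], stop_when_drained);

	close(in[0]);
	close(in[1]);
	close(out[0]);
	close(out[1]);
	return eof;
}
```

With stop_when_drained set, the model leaves the loop right after forwarding the last data and never sees the EOF. Without it, the second recv() returns 0 and EOF is recognised in the same pass.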
Oops. The fix looks correct to me, but I wonder: is it clear to you why the issue only started occurring in this release? This code had "always" been there.
Because we didn't use to force resets on abnormal connection terminations, so it still worked by accident.
I see a few possible directions but I'm not quite sure. Not that important anyway, if you could reproduce the issue and this fixes it.
Ah, actually, I do still need to test with the original reproducer. It fixes it for my reproducer which I'm maybe 90% confident is exercising the same bug.
Just one nit:
Link: https://bugs.passt.top/show_bug.cgi?id=202
Signed-off-by: David Gibson
Reported-by: Paul Holzinger
Good point, fixed.
---
 tcp_splice.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcp_splice.c b/tcp_splice.c
index 1359d6b8..34ffea73 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -605,7 +605,7 @@ retry:
 			}
 		}

-		break;
+		continue;
 	}

 	conn->read[fromsidei] += readlen > 0 ? readlen : 0;
-- Stefano
-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson