On Thu, 28 May 2026 15:02:11 +1000, David Gibson wrote:
We set the OUT_WAIT flag if we stop forwarding due to EAGAIN, but there's still data in the pipe. That ensures we wake up when the output socket has room to drain the pipe into.
We clear the OUT_WAIT flag when we complete forwarding on an EPOLLOUT event, but that's not quite right. Even though it's called on an EPOLLOUT, tcp_splice_forward() could, in principle, empty the pipe, but also read enough new data from the other side to fill it again. That would set OUT_WAIT internally, but it would be cleared after returning, meaning we could miss a necessary wakeup.
The current logic in tcp_splice_sock_handler():

	if (events & EPOLLOUT) {
		if (tcp_splice_forward(c, conn, !evsidei, now))
			goto reset;
		conn_event(conn, ~OUT_WAIT(evsidei));
	}

	if (events & EPOLLIN) {
		if (tcp_splice_forward(c, conn, evsidei, now))
			goto reset;
	}

would prevent the case you described, because if we read new data from the other side filling the pipe, we'll hit (events & EPOLLIN) and set OUT_WAIT again if needed.

But there's a case this should actually fix, even though I've never seen it happening in practice: what if we *don't* read new data from the other side, and we can't empty the pipe in one EPOLLOUT shot anyway? I hadn't considered that before but if the receiver is slow enough that's probably possible.
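To make that second case concrete, here is a rough toy model of the pre-patch EPOLLOUT path. It is only a sketch and not passt code: struct toy_conn, toy_forward_old(), toy_epollout_old() and the 'room' parameter (how many bytes the output socket accepts before it would return EAGAIN) are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical, not passt code): one direction of a splice */
struct toy_conn {
	size_t pending;	/* bytes still sitting in the pipe */
	bool out_wait;	/* stands in for the OUT_WAIT flag */
};

/* Drain up to 'room' bytes from the pipe into the output socket. Like the
 * pre-patch loop, raise out_wait only when the socket fills up (the EAGAIN
 * case) while data is still pending.
 */
static void toy_forward_old(struct toy_conn *conn, size_t room)
{
	size_t written;

	assert(conn);

	written = conn->pending < room ? conn->pending : room;
	conn->pending -= written;

	if (conn->pending)
		conn->out_wait = true;
}

/* Pre-patch EPOLLOUT path: forward, then clear the flag unconditionally */
static void toy_epollout_old(struct toy_conn *conn, size_t room)
{
	toy_forward_old(conn, room);
	conn->out_wait = false;
}
```

Starting from 100 pending bytes and a socket that only takes 40, toy_epollout_old() leaves 60 bytes in the pipe with out_wait cleared, which is the missed wakeup, and no new data from the other side is needed to get there.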
The condition on whether we need write side wakeups is actually fairly simple: we need them if and only if we return to the main loop with data in the pipe. Maintain that in a single place - right after we exit the forwarding loop in tcp_splice_forward().
Signed-off-by: David Gibson
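The rule stated in the commit message (write-side wakeups if and only if data is left in the pipe) can be sketched with a small toy model. This is an illustration under simplifying assumptions, not the patched function: struct pipe_model, pipe_forward() and the 'readlen' and 'room' parameters are hypothetical, and the real forwarding loop is collapsed into a single read and a single write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical, not passt code): one direction of a splice */
struct pipe_model {
	size_t pending;	/* bytes still sitting in the pipe */
	bool out_wait;	/* stands in for the OUT_WAIT flag */
};

/* One forwarding pass: 'readlen' bytes arrive from the input side, then up
 * to 'room' bytes are accepted by the output socket. The flag is decided in
 * exactly one place, after the transfer, from the pending count alone.
 */
static void pipe_forward(struct pipe_model *p, size_t readlen, size_t room)
{
	size_t written;

	assert(p);

	p->pending += readlen;

	written = p->pending < room ? p->pending : room;
	p->pending -= written;

	/* Wake up on output room if and only if there is data to drain */
	p->out_wait = p->pending > 0;
}
```

Because the flag is recomputed on every pass, it doesn't matter whether the pass was triggered by EPOLLIN or EPOLLOUT: a partial drain leaves it set, and a complete drain clears it.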
---
 tcp_splice.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/tcp_splice.c b/tcp_splice.c
index 42902684..5f412584 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -531,19 +531,22 @@ static int tcp_splice_forward(struct ctx *c,
 		conn->pending[fromsidei] += readlen > 0 ? readlen : 0;
 		conn->pending[fromsidei] -= written > 0 ? written : 0;

-		if (written < 0) {
-			if (!conn->pending[fromsidei])
-				break;
-
-			conn_event(conn, OUT_WAIT(!fromsidei));
+		if (written < 0)
 			break;
-		}

 		if (conn->events & FIN_RCVD(fromsidei) &&
 		    !conn->pending[fromsidei])
 			break;
 	}

+	/* We need write-side wakeups if and only if we have data in the pipe to
+	 * drain.
+	 */
+	if (conn->pending[fromsidei])
+		conn_event(conn, OUT_WAIT(!fromsidei));
+	else
+		conn_event(conn, ~OUT_WAIT(!fromsidei));
+
 	if ((conn->events & FIN_RCVD(fromsidei)) &&
 	    !(conn->events & FIN_SENT(!fromsidei)) &&
 	    !conn->pending[fromsidei]) {
@@ -606,7 +609,6 @@ void tcp_splice_sock_handler(struct ctx *c, union epoll_ref ref,
 	if (events & EPOLLOUT) {
 		if (tcp_splice_forward(c, conn, !evsidei, now))
 			goto reset;
-		conn_event(conn, ~OUT_WAIT(evsidei));
 	}

 	if (events & (EPOLLIN | EPOLLRDHUP)) {
-- Stefano