On Fri, 11 Apr 2025 14:54:50 +1000 David Gibson <david(a)gibson.dropbear.id.au> wrote:Hi Stefano, When debugging the splice EINTR bug I fixed the other day, I found the whole tcp_splice_sock_handler() pretty confusing to follow. So, I was working on some cleanups. But then I noticed something more specifically odd here. We've discussed the use of SO_RCVLOWAT previously. AIUI, you found it essential to achieve reasonable throughput and load for spliced connections.From my tests back then (never on what I ended up committing, it seems) it wasn't needed in general, it helped only with bulk transfers that never feel the pipe for some reason. With iperf3, I needed to play with parameters quite a bit to reproduce something like that. You would need (at least) to disable Nagle's algorithm (-N) and send small messages (say, -l 4k instead of -l 1M).I think we've agreed before that it's not entirely the right tool for the job; just the only one available. Except... as far as I can tell, it's never invoked. AFAICT the only place we enable the RCVLOWAT stuff is in a block under this if: if (!(conn->flags & lowat_set_flag) && readlen > (long)c->tcp.pipe_size / 10) { But... this occurs immediately after: if (readlen >= (long)c->tcp.pipe_size * 10 / 100) continue; .. which is a strictly more inclusive condition, so we'll never reach the RCVLOWAT block.Right, yes, I think we noticed a while ago a bit after trying to restore the functionality with 01b6a164d94f ("tcp_splice: A typo three years ago and SO_RCVLOWAT is gone"). That wasn't sufficient.To confirm, I tried putting an ASSERT(0) in that block, and didn't hit it with spliced iperf3 runs. Am I missing something?No, not really, and that error has been there forever, since I "added" (not really) the feature in 904b86ade7db ("tcp: Rework window handling, timers, add SO_RCVLOWAT and pools for sockets/pipes"). My intention was actually: if (read >= (long)c->tcp.pipe_size * 90 / 100) continue; or something like that, perhaps 50% even. The idea behind it was: if we already have good pipe utilisation, there's no need for SO_RCVLOWAT, and we should retry calling splice() right away. But at this point we should try again with iperf3 and smaller messages. Perhaps even limiting the throughput (-b) with multiple flows... -- Stefano