On Wed, Apr 24, 2024 at 08:30:44PM +0200, Stefano Brivio wrote:
> On Wed, 24 Apr 2024 10:48:05 +1000
> David Gibson <david(a)gibson.dropbear.id.au> wrote:
>
> > On Tue, Apr 23, 2024 at 07:50:10PM +0200, Stefano Brivio wrote:
> > > On Sat, 20 Apr 2024 15:19:19 -0400
> > > Jon Maloy <jmaloy(a)redhat.com> wrote:

[snip]

> > > > +	set_peek_offset(s, 0);
> > >
> > > Do we really need to initialise it to zero on a new connection?
> > > Extra system calls on this path matter for latency of connection
> > > establishment.
> >
> > Sort of, yes: we need to enable the SO_PEEK_OFF behaviour by setting
> > it to 0, rather than the default -1.
>
> By the way of which, this is not documented at this point -- a man
> page patch (linux-man and linux-api lists) would be nice.
>
> > We could lazily enable it, but we'd need either to a) do it later in
> > the handshake (maybe when we set ESTABLISHED), but we'd need to be
> > careful it is always set before the first MSG_PEEK
>
> I was actually thinking that we could set it only as we receive data
> (not every connection will receive data), and keep this out of the
> handshake (which we want to keep "faster", I think).

That makes sense, but I think it would need a per-connection flag.

> And setting it as we mark a connection as ESTABLISHED should have the
> same effect on latency as setting it on a new connection -- that's
> not really lazy. So, actually:

Good point.

> > or b) keep track of whether it's set on a per-socket basis (this
> > would have the advantage of robustness if we ever encountered a
> > kernel that weirdly allows it for some but not all TCP sockets).
>
> ...this could be done as we receive data in tcp_data_from_sock(), with
> a new flag in tcp_tap_conn::flags, to avoid adding latency to the
> handshake. It also looks more robust to me, and done/checked in a
> single place where we need it. We have just three bits left there
> which isn't great, but if we need to save one at a later point, we can
> drop this new flag easily.

I just realised that folding the feature detection into this is a bit
costlier than I thought. If we globally probe the feature we just need
one bit per connection: is SO_PEEK_OFF set yet or not. If we tried to
probe per-connection we'd need a tristate: haven't tried / SO_PEEK_OFF
enabled / tried and failed.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson