On Fri, Apr 26, 2024 at 07:58:32AM +0200, Stefano Brivio wrote:
On Fri, 26 Apr 2024 13:27:11 +1000
David Gibson <david(a)gibson.dropbear.id.au> wrote:
On Wed, Apr 24, 2024 at 08:30:44PM +0200, Stefano
Brivio wrote:
On Wed, 24 Apr 2024 10:48:05 +1000
David Gibson <david(a)gibson.dropbear.id.au> wrote:
On Tue, Apr 23, 2024 at 07:50:10PM +0200, Stefano
Brivio wrote:
> On Sat, 20 Apr 2024 15:19:19 -0400
> Jon Maloy <jmaloy(a)redhat.com> wrote:
[snip]
> > + set_peek_offset(s, 0);
>
> Do we really need to initialise it to zero on a new connection? Extra
> system calls on this path matter for latency of connection
> establishment.
Sort of, yes: we need to enable the SO_PEEK_OFF behaviour by setting
it to 0, rather than the default -1.
By the way of which, this is not documented at this point -- a man page
patch (linux-man and linux-api lists) would be nice.
We could lazily enable it, but
we'd need either to a) do it later in the handshake (maybe when we set
ESTABLISHED), but we'd need to be careful it is always set before the
first MSG_PEEK
I was actually thinking that we could set it only as we receive data
(not every connection will receive data), and keep this out of the
handshake (which we want to keep "faster", I think).
That makes sense, but I think it would need a per-connection flag.
Definitely.
And
setting it as we mark a connection as ESTABLISHED should have the
same effect on latency as setting it on a new connection -- that's not
really lazy. So, actually:
Good point.
or b)
keep track of whether it's set on a per-socket
basis (this would have the advantage of robustness if we ever
encountered a kernel that weirdly allows it for some but not all TCP
sockets).
...this could be done as we receive data in tcp_data_from_sock(), with
a new flag in tcp_tap_conn::flags, to avoid adding latency to the
handshake. It also looks more robust to me, and done/checked in a
single place where we need it.
We have just three bits left there which isn't great, but if we need to
save one at a later point, we can drop this new flag easily.
I just realised that folding the feature detection into this is a bit
costlier than I thought. If we globally probe the feature we just
need one bit per connection: is SO_PEEK_OFF set yet or not. If we
tried to probe per-connection we'd need a tristate: haven't tried /
SO_PEEK_OFF enabled / tried and failed.
I forgot to mention this part: what I wanted to propose was actually
still a global probe, so that we don't waste one system call per
connection on kernels not supporting this (a substantial use case for a
couple of years from now?), which probably outweighs the advantage of
the weird, purely theoretical kernel not supporting the feature for
some sockets only.
And then something like PEEK_OFFSET_SET
(SO_PEEK_OFF_SET sounds awkward
to me) on top. Another advantage is avoiding the tristate you described.
Right, having thought it through I agree this is a better approach.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson