Re: [PATCH 4/7] udp: Separate bound sockets from flags

28 Feb 2024

      On Wed, Feb 28, 2024 at 01:30:05PM +1100, David Gibson wrote:
...
On Tue, Feb 27, 2024 at 12:56:59PM +0100, Stefano Brivio wrote:
...
On Thu, 22 Feb 2024 10:21:12 +1100
David Gibson  wrote:
[snip]
...
...
I'm having a hard time reviewing the rest because of this, I think.
Earlier, UDP_ACT_TAP meant that the activity timestamp was observed on
a non-spliced port.
Ah!  I'd read those as "ACTions we needed to take" in the expiry
timers, rather than tying them to specific activity.  It amounts to
the same thing pre-patch, but it makes more sense to me with that new
lens.
...
Now, we have UDP_ACT_FLAGS which seems to mean that activity was seen
in the direction of (to) the "tap" side, and UDP_ACT_TAP_SOCK meaning
that activity was seen in the direction of (to) the socket.
But that's somehow the opposite of UDP_ACT_SPLICE_NS and
UDP_ACT_SPLICE_INIT -- even though they don't actually indicate
activity, rather the fact that activity timestamps refer to sockets that
originated in (*from*) the target namespace, or in the initial
namespace.
Anyway, I'm not quite sure: why do we need two separate flags for "tap"
(from/to) activity?
We don't, this is an ugly side effect.  The problem is that in
scanning for expiries we only have a single port number.  But the
socket is associated with a particular forwarding port, while the
flags are associated with a particular endpoint port.  Without flow
tracking, we don't have a way to match one to the other.
The compromise here is - or is supposed to be - that we have
independent timestamps for the socket and the flags.  The two
timestamps associated with a single flow are supposed to be updated at
basically the same time - looks like I missed some things to do that
as well.
...
I mean, as long as we observe activity on a non-spliced flow, we'll
update the related timestamp, and we make sure the activity flag is
set. Can't it simply be UDP_ACT_TAP?
Urgh.  So it seems like every time I re-examine this, I find more
pre-existing bugs.  So, part of what this patch was attempting to fix
originally is that previously, although we updated the timestamp host->guest
packets, we updated the wrong one for expiring the socket (based on
endpoint rather than forwarding port).

In trying to fix it up, I additionally realised, we currently don't
update the timestamp for host to guest packets at all, except in the
when remapping the src to gateway address, which as noted previously
itself doesn't honour the no_map_gw flag.

*Then* I realised that if we do update the correct timestamp and ACT
flag, we could now set an ACT flag for a socket created by -u, and
therefore cause a port that should remain open to be expired.

I don't think this any longer counts as a "small" fix, so I will
remove it from this series and tackle it elsewhere.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson