On Thu, 30 Jan 2025 18:38:22 +1100
David Gibson <david(a)gibson.dropbear.id.au> wrote:

> Right, but in the present draft you pay that cost whether or not
> you're actually using the flows. Unfortunately a busy server with
> heaps of active connections is exactly the case that's likely to be
> most sensitive to additional downtime, but there's not really any
> getting around that. A machine with a lot of state will need either
> high downtime or high migration bandwidth.

It's... sixteen megabytes. A KubeVirt node is only allowed to perform
up to _four_ migrations in parallel, and that's our main use case at
the moment. "High downtime" is kind of relative.

> But, I'm really hoping we can move relatively quickly to a model
> where a guest with only a handful of connections _doesn't_ have to
> pay that 128k flow cost - and can consequently migrate ok even with
> quite constrained migration bandwidth. In that scenario the size of
> the header could become significant.

I think the biggest cost of the full flow table transfer is rather
code that's a bit quicker to write (I just managed to properly set
sequences on the target, connections don't quite "flow" yet) but
relatively high maintenance (as you mentioned, we need to be careful
about every single field) and easy to break.

I would like to quickly complete the whole flow first, because I
think we can inform design and implementation decisions much better
at that point, and we can be sure it's feasible. But I'm not
particularly keen to merge this patch as it is if we can switch
relatively swiftly to an implementation where we model a smaller
fixed-endian structure with just the stuff we need. And again, to be
a bit more sure of which stuff we need in it, having the full flow
implemented is useful.

Actually, the biggest complications I see in switching to that
approach, from the current point, are that we need to, I guess:

1. model arrays (not really complicated by itself)

2. have a temporary structure where we store flows, instead of using
   the flow table directly (meaning that the "data model" needs to
   logically decouple the source and the destination of the copy)

3. batch stuff to some extent. We'll call socket() and connect() once
   for each socket anyway, obviously, but sending one message to the
   TCP_REPAIR helper for each socket looks like a rather substantial
   and avoidable overhead.

To me this part actually looks like the biggest priority after (or
while) getting the whole thing to work, because we could then start
right away with a 'v1' that looks more sustainable. And in that case
I would just get things working on x86_64, without even implementing
conversions, endianness switches, and so on.

-- 
Stefano

It's on my queue for the next few days.

It's both easier to do and a bigger win in most cases. That would
dramatically reduce the size sent here.

Yep, feel free.
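
For reference, a minimal sketch of the sequence-setting step mentioned
above (setting sequences on the target with TCP_REPAIR). This is a
standalone illustration of the kernel interface, not the actual passt
code; error reporting is omitted, the sequence values are whatever was
carried over from the source, and it needs CAP_NET_ADMIN (hence the
helper) plus a reasonably recent glibc/kernel for the constants:

    #include <stdint.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>	/* TCP_REPAIR, TCP_REPAIR_QUEUE, TCP_QUEUE_SEQ */

    /* Set sequence numbers on a fresh socket before connect(): with
     * TCP_REPAIR enabled, connect() doesn't send a SYN and the socket
     * goes straight to ESTABLISHED using the queued sequences.
     */
    static int restore_seqs(int s, uint32_t snd_seq, uint32_t rcv_seq)
    {
    	int yes = 1, q;

    	if (setsockopt(s, IPPROTO_TCP, TCP_REPAIR, &yes, sizeof(yes)))
    		return -1;

    	q = TCP_SEND_QUEUE;
    	if (setsockopt(s, IPPROTO_TCP, TCP_REPAIR_QUEUE, &q, sizeof(q)) ||
    	    setsockopt(s, IPPROTO_TCP, TCP_QUEUE_SEQ, &snd_seq, sizeof(snd_seq)))
    		return -1;

    	q = TCP_RECV_QUEUE;
    	if (setsockopt(s, IPPROTO_TCP, TCP_REPAIR_QUEUE, &q, sizeof(q)) ||
    	    setsockopt(s, IPPROTO_TCP, TCP_QUEUE_SEQ, &rcv_seq, sizeof(rcv_seq)))
    		return -1;

    	/* bind() and connect() follow, then TCP_REPAIR is turned off */
    	return 0;
    }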
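
And just to make the "smaller fixed-endian structure with just the
stuff we need" idea a bit more concrete, a rough sketch of what one
per-flow entry might look like. Field names and the exact selection
are purely illustrative, not a proposal for the actual layout:

    #include <stdint.h>

    /* One migrated TCP flow: fixed layout regardless of host
     * architecture, all multi-byte fields in network order (the types
     * don't enforce that, conversion happens when filling the entry).
     * Only state that can't be re-derived on the target is included.
     */
    struct tcp_flow_migrate {
    	uint8_t		af;		/* AF_INET or AF_INET6 */
    	uint8_t		addr[2][16];	/* source, destination
    					 * (IPv4-mapped if af == AF_INET) */
    	uint16_t	port[2];	/* source, destination port */

    	uint32_t	snd_seq;	/* sequence for the send queue */
    	uint32_t	rcv_seq;	/* sequence for the receive queue */
    	uint16_t	mss;
    	uint8_t		wnd_scale[2];	/* send, receive window shift */
    } __attribute__((packed));

A small header with a count, followed by that many entries, would then
be the "array" from point 1. above.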
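
About point 3., I'm not pinning down the helper's interface here, but
assuming it takes socket descriptors over a UNIX domain socket, one
generic way to batch would be a single sendmsg() carrying a whole
SCM_RIGHTS batch instead of one message per socket (sketch only, the
payload byte and the function are made up for illustration):

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    #define FD_BATCH_MAX	253	/* kernel limit (SCM_MAX_FD) per message */

    /* Pass up to FD_BATCH_MAX socket descriptors to the helper in one
     * sendmsg(); larger sets need to be split into multiple batches.
     */
    static int send_fd_batch(int helper, const int *fds, int n)
    {
    	union {
    		char buf[CMSG_SPACE(FD_BATCH_MAX * sizeof(int))];
    		struct cmsghdr align;
    	} u;
    	char byte = 0;			/* dummy one-byte payload */
    	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    	struct msghdr msg;
    	struct cmsghdr *cmsg;

    	if (n < 1 || n > FD_BATCH_MAX)
    		return -1;

    	memset(&u, 0, sizeof(u));
    	memset(&msg, 0, sizeof(msg));
    	msg.msg_iov = &iov;
    	msg.msg_iovlen = 1;
    	msg.msg_control = u.buf;
    	msg.msg_controllen = CMSG_SPACE(n * sizeof(int));

    	cmsg = CMSG_FIRSTHDR(&msg);
    	cmsg->cmsg_level = SOL_SOCKET;
    	cmsg->cmsg_type = SCM_RIGHTS;
    	cmsg->cmsg_len = CMSG_LEN(n * sizeof(int));
    	memcpy(CMSG_DATA(cmsg), fds, n * sizeof(int));

    	return sendmsg(helper, &msg, 0) < 0 ? -1 : 0;
    }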