[PATCH 0/6] More migration improvements
Here's another batch of migration changes. The first 5 patches are fixups for fairly minor issues in the existing draft patches. It probably makes sense to fold them into the existing patches. The last one changes the structure of the migration callbacks / sections with a new scheme that's simpler and more flexible. Specifically it will allow callbacks which "stream" data in a format designed for migration rather than being identical to the in memory structure. David Gibson (6): vhost-user: Change different vhost-user messages to trace() level migrate, flow: Abort migration on repair_flush() failure migrate: Clearer debug message in migrate_request() migrate: Handle sending header section from data sections util: read_remainder should take const pointer to iovec migrate: Make migration handlers simpler and more flexible flow.c | 17 +++- flow.h | 6 +- migrate.c | 277 ++++++++++++++++++--------------------------------- migrate.h | 54 ++++++---- util.c | 2 +- util.h | 2 +- vhost_user.c | 8 +- vu_common.c | 6 +- 8 files changed, 155 insertions(+), 217 deletions(-) -- 2.48.1
A previous patch changed the debug() calls in vu_control_handler() to
trace() to reduce noise when debugging migration. However those messages
are infrequent, and pretty useful when debugging migration (I used them to
discover I needed a different qemu version).
At the same time we have potentially per-packet debug() messages in
vu_kick_cb() and vu_send_single(). These get much more in the way since
they occur asynchronously with the migration operation. As a rule, per
packet messages should be trace() level anyway.
[This is a candidate to fold into the earlier patch]
Signed-off-by: David Gibson
On Mon, 3 Feb 2025 20:26:10 +1100
David Gibson
A previous patch changed the debug() calls in vu_control_handler() to trace() to reduce noise when debugging migration. However those messages are infrequent, and pretty useful when debugging migration (I used them to discover I needed a different qemu version).
At the same time we have potentially per-packet debug() messages in vu_kick_cb() and vu_send_single(). These get much more in the way since they occur asynchronously with the migration operation. As a rule, per packet messages should be trace() level anyway.
[This is a candidate to fold into the earlier patch]
Signed-off-by: David Gibson
Folded into existing patch and applied. -- Stefano
If we're unable to set our sockets into repair mode, we won't be able to
complete our migration. However flow_migrate_source_pre() currently
ignores errors from repair_flush().
Propagate the error instead. Since we're not currently able to properly
roll back a partial migration setup this fails rather messily, but better
than carrying on as if nothing was wrong.
[This is a candidate to fold into the earlier patch]
Signed-off-by: David Gibson
The message confusingly gives the *previous* value of c->device_state_fd as
the fd, not the newly assigned one.
[This is a candidate to fold into the earlier patch]
Signed-off-by: David Gibson
Currently the global state header is included in the list of data sections
to send for migration. This makes for an asymmetry between the source and
target sides: for the source, the header is sent after the 'pre' handlers
along with all the rest of the data. For the target side, the header must
be read first (before the 'pre' handlers), so that we know the version
which determines what all the rest of the data will be.
Change this so that the header is handled explicitly on both the source and
target side. This will make some future changes simpler as well.
Signed-off-by: David Gibson
read_remainder() takes a struct iovec * to describe where it will read its
data, unlike write_remainder() which takes a const struct iovec *. At
first this seems like it makes sense, since read_remainder() will alter
data within the iovec.
However, what it actually alters is data within the buffers described by
the iovec, not the iovec entries itself. So, like write it should take
a const struct iovec *.
[This is a candidate for folding into the earlier patch]
Signed-off-by: David Gibson
On Mon, 3 Feb 2025 20:26:14 +1100
David Gibson
read_remainder() takes a struct iovec * to describe where it will read its data, unlike write_remainder() which takes a const struct iovec *. At first this seems like it makes sense, since read_remainder() will alter data within the iovec.
However, what it actually alters is data within the buffers described by the iovec, not the iovec entries itself. So, like write it should take a const struct iovec *.
[This is a candidate for folding into the earlier patch]
Folded into existing patch and applied. -- Stefano
Currently the structure of the migration callbacks is quite rigid. There
are separate tables for pre and post handlers, for source and target.
Those only prepare or fix up with no reading or writing of the stream.
The actual reading and writing is done with an iovec of buffers to
transfer. That can't handle any variably sized structures, nor sending
state which is only obtained during migration rather than being tracked
usually.
Replace both the handlers and the sections with an ordered list of
migration "stages". Each stage has a callback for both source and target
side doing whatever is necessary - these can be NULL, for example for
preparation steps that have no equivalent on the other side. Things that
are just buffers to be transferred are handled with a macro generating a
suitable stage entry.
Signed-off-by: David Gibson
On Mon, 3 Feb 2025 20:26:15 +1100
David Gibson
+/* Stages for version 1 */ +static const struct migrate_stage stages_v1[] = { + { + .name = "flow pre", + .source = flow_migrate_source_pre, + .target = NULL, + }, + DATA_STAGE(flow_first_free), + DATA_STAGE(flowtab), + DATA_STAGE(flow_hashtab), + DATA_STAGE(g_hash_secret),
...so this one for g_hash_secret (which is the abomination I wanted to drop) would eventually turn into a function? It looks neat, I'm just not sure if we'll really have "data stages" after I change the approach to only transfer the data we need as you suggested. Is that part of your pending queue by the way, or should I go ahead with it?
[...]
/** - * struct migrate_data - Data sections for given source version - * @v: Source version this applies to, host order - * @sections: Array of data sections, NULL-terminated + * migrate_cb_t - Callback function to implement one migration stage */ -struct migrate_data { - uint32_t v; - struct iovec *sections; -}; +typedef int (*migrate_cb_t)(struct ctx *c, struct migrate_meta *m, + const struct migrate_stage *stage, int fd);
typedef is really annoying, we already have a flow_sidx which kind of made sense, but this has really no advantage over something like: struct migrate_handler { int (*fn)(struct ctx *c, struct migrate_meta *m); }; -- Stefano
On Mon, Feb 03, 2025 at 11:20:03AM +0100, Stefano Brivio wrote:
On Mon, 3 Feb 2025 20:26:15 +1100 David Gibson
wrote: +/* Stages for version 1 */ +static const struct migrate_stage stages_v1[] = { + { + .name = "flow pre", + .source = flow_migrate_source_pre, + .target = NULL, + }, + DATA_STAGE(flow_first_free), + DATA_STAGE(flowtab), + DATA_STAGE(flow_hashtab), + DATA_STAGE(g_hash_secret),
...so this one for g_hash_secret (which is the abomination I wanted to drop) would eventually turn into a function?
Yes.
It looks neat, I'm just not sure if we'll really have "data stages" after I change the approach to only transfer the data we need as you suggested.
I agree, that may well be the case, but we can just drop the macro and helepr functions then.
Is that part of your pending queue by the way, or should I go ahead with it?
Is which part of my pending queue? g_hash_secret as a function? No, I've thought of it, but I haven't written it yet.
[...]
/** - * struct migrate_data - Data sections for given source version - * @v: Source version this applies to, host order - * @sections: Array of data sections, NULL-terminated + * migrate_cb_t - Callback function to implement one migration stage */ -struct migrate_data { - uint32_t v; - struct iovec *sections; -}; +typedef int (*migrate_cb_t)(struct ctx *c, struct migrate_meta *m, + const struct migrate_stage *stage, int fd);
typedef is really annoying, we already have a flow_sidx which kind of made sense, but this has really no advantage over something like:
Eh, I guess. I find the extra . or -> a little annoying, but we can do this if you'd prefer.
struct migrate_handler { int (*fn)(struct ctx *c, struct migrate_meta *m); };
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Tue, 4 Feb 2025 10:55:28 +1100
David Gibson
On Mon, Feb 03, 2025 at 11:20:03AM +0100, Stefano Brivio wrote:
On Mon, 3 Feb 2025 20:26:15 +1100 David Gibson
wrote: +/* Stages for version 1 */ +static const struct migrate_stage stages_v1[] = { + { + .name = "flow pre", + .source = flow_migrate_source_pre, + .target = NULL, + }, + DATA_STAGE(flow_first_free), + DATA_STAGE(flowtab), + DATA_STAGE(flow_hashtab), + DATA_STAGE(g_hash_secret),
...so this one for g_hash_secret (which is the abomination I wanted to drop) would eventually turn into a function?
Yes.
It looks neat, I'm just not sure if we'll really have "data stages" after I change the approach to only transfer the data we need as you suggested.
I agree, that may well be the case, but we can just drop the macro and helepr functions then.
Is that part of your pending queue by the way, or should I go ahead with it?
Is which part of my pending queue? g_hash_secret as a function? No, I've thought of it, but I haven't written it yet.
No, I meant: "transfer the data we need".
[...]
/** - * struct migrate_data - Data sections for given source version - * @v: Source version this applies to, host order - * @sections: Array of data sections, NULL-terminated + * migrate_cb_t - Callback function to implement one migration stage */ -struct migrate_data { - uint32_t v; - struct iovec *sections; -}; +typedef int (*migrate_cb_t)(struct ctx *c, struct migrate_meta *m, + const struct migrate_stage *stage, int fd);
typedef is really annoying, we already have a flow_sidx which kind of made sense, but this has really no advantage over something like:
Eh, I guess. I find the extra . or -> a little annoying, but we can do this if you'd prefer.
Well, I find not finding things because of the typedef quite annoying. Actually, we don't even need a struct if you prefer (I actually do, but I know not everybody finds the syntax for function pointers convenient). I'm re-posting this squashed onto previous patches but it would be nice if you could re-post a version of the whole thing without the typedef. -- Stefano
On Tue, Feb 04, 2025 at 01:12:09AM +0100, Stefano Brivio wrote:
On Tue, 4 Feb 2025 10:55:28 +1100 David Gibson
wrote: On Mon, Feb 03, 2025 at 11:20:03AM +0100, Stefano Brivio wrote:
On Mon, 3 Feb 2025 20:26:15 +1100 David Gibson
wrote: +/* Stages for version 1 */ +static const struct migrate_stage stages_v1[] = { + { + .name = "flow pre", + .source = flow_migrate_source_pre, + .target = NULL, + }, + DATA_STAGE(flow_first_free), + DATA_STAGE(flowtab), + DATA_STAGE(flow_hashtab), + DATA_STAGE(g_hash_secret),
...so this one for g_hash_secret (which is the abomination I wanted to drop) would eventually turn into a function?
Yes.
It looks neat, I'm just not sure if we'll really have "data stages" after I change the approach to only transfer the data we need as you suggested.
I agree, that may well be the case, but we can just drop the macro and helepr functions then.
Is that part of your pending queue by the way, or should I go ahead with it?
Is which part of my pending queue? g_hash_secret as a function? No, I've thought of it, but I haven't written it yet.
No, I meant: "transfer the data we need".
No, I haven't begun actually writing it.
[...]
/** - * struct migrate_data - Data sections for given source version - * @v: Source version this applies to, host order - * @sections: Array of data sections, NULL-terminated + * migrate_cb_t - Callback function to implement one migration stage */ -struct migrate_data { - uint32_t v; - struct iovec *sections; -}; +typedef int (*migrate_cb_t)(struct ctx *c, struct migrate_meta *m, + const struct migrate_stage *stage, int fd);
typedef is really annoying, we already have a flow_sidx which kind of made sense, but this has really no advantage over something like:
Eh, I guess. I find the extra . or -> a little annoying, but we can do this if you'd prefer.
Well, I find not finding things because of the typedef quite annoying.
Actually, we don't even need a struct if you prefer (I actually do, but I know not everybody finds the syntax for function pointers convenient).
I'm re-posting this squashed onto previous patches but it would be nice if you could re-post a version of the whole thing without the typedef.
Ok. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
participants (2)
-
David Gibson
-
Stefano Brivio