From Red Hat internal testing we've had some reports that if attempting to migrate without passt-repair, the failure mode is uglier than we'd like. The migration fails, which is somewhat expected, but we don't correctly roll things back on the source, so it breaks network there as well. Address this and several other fragilities in the migration. Everything tested with the basic test suite, plus some additional testing for the specific functionality of the patches: For patches 1..2: * I get a clean migration if there are now active flows * Migration completes, although connections are broken if passt-repair isn't connected For patches 3..5: * Migration completes ok if the source and destination hosts have different IPs * Target correctly sees the bind() failure and abandons the flow * Unfortunately, target-side client doesn't get a reset, it just sits there not working. This is because a) the RST we try to send is lost because the queue is still inactive over the migration and b) we don't send RSTs or ICMPs for packets from the guest which no longer match a flow (I hope to tackle this soon) * After manually quitting the stuck client on the target, other connectivity works There are more fragile cases that I'm looking to fix, particularly the die()s in flow_migrate_source_rollback() and elsewhere. David Gibson (5): migrate, flow: Trivially succeed if migrating with no flows migrate, flow: Don't attempt to migrate TCP flows without passt-repair tcp: Correct error code handling from tcp_flow_repair_socket() tcp: Unconditionally move to CLOSED state on tcp_rst() migrate, tcp: Don't flow_alloc_cancel() during incoming migration flow.c | 17 +++++++++++++++-- tcp.c | 28 +++++++++++++++++++++------- 2 files changed, 36 insertions(+), 9 deletions(-) -- 2.48.1
We could get a migration request when we have no active flows; or at least none that we need or are able to migrate. In this case after sending or receiving the number of flows we continue to step through various lists. In the target case, this could include communication with passt-repair. If passt-repair wasn't started that could cause further errors, but of course they shouldn't matter if we have nothing to repair. Make it more obvious that there's nothing to do and avoid such errors by short-circuiting flow_migrate_{source,target}() if there are no migratable flows. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- flow.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/flow.c b/flow.c index bb5dcc3c..6cf96c26 100644 --- a/flow.c +++ b/flow.c @@ -999,6 +999,9 @@ int flow_migrate_source(struct ctx *c, const struct migrate_stage *stage, debug("Sending %u flows", ntohl(count)); + if (!count) + return 0; + /* Dump and send information that can be stored in the flow table. * * Limited rollback options here: if we fail to transfer any data (that @@ -1070,6 +1073,9 @@ int flow_migrate_target(struct ctx *c, const struct migrate_stage *stage, count = ntohl(count); debug("Receiving %u flows", count); + if (!count) + return 0; + if ((rc = flow_migrate_repair_all(c, true))) return -rc; -- 2.48.1
Migrating TCP flows requires passt-repair in order to use TCP_REPAIR. If passt-repair is not started, our failure mode is pretty ugly though: we'll attempt the migration, hitting various problems when we can't enter repair mode. In some cases we may not roll back these changes properly, meaning we break network connections on the source. Our general approach is not to completely block migration if there are problems, but simply to break any flows we can't migrate. So, if we have no connection from passt-repair carry on with the migration, but don't attempt to migrate any TCP connections. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- flow.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/flow.c b/flow.c index 6cf96c26..749c4984 100644 --- a/flow.c +++ b/flow.c @@ -923,6 +923,10 @@ static int flow_migrate_repair_all(struct ctx *c, bool enable) union flow *flow; int rc; + /* If we don't have a repair helper, there's nothing we can do */ + if (c->fd_repair < 0) + return 0; + foreach_established_tcp_flow(flow) { if (enable) rc = tcp_flow_repair_on(c, &flow->tcp); @@ -987,8 +991,11 @@ int flow_migrate_source(struct ctx *c, const struct migrate_stage *stage, (void)c; (void)stage; - foreach_established_tcp_flow(flow) - count++; + /* If we don't have a repair helper, we can't migrate TCP flows */ + if (c->fd_repair >= 0) { + foreach_established_tcp_flow(flow) + count++; + } count = htonl(count); if (write_all_buf(fd, &count, sizeof(count))) { -- 2.48.1
There are two small bugs in error returns from tcp_low_repair_socket(), which is supposed to return a negative errno code: 1) On bind() failures, wedirectly pass on the return code from bind(), which is just 0 or -1, instead of an error code. 2) In the caller, tcp_flow_migrate_target() we call strerror_() directly on the negative error code, but strerror() requires a positive error code. Correct both of these. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- tcp.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/tcp.c b/tcp.c index e3c0a53b..8528ee35 100644 --- a/tcp.c +++ b/tcp.c @@ -3280,7 +3280,8 @@ int tcp_flow_repair_socket(struct ctx *c, struct tcp_tap_conn *conn) tcp_sock_set_nodelay(s); - if ((rc = bind(s, &a.sa, sizeof(a)))) { + if (bind(s, &a.sa, sizeof(a))) { + rc = -errno; err_perror("Failed to bind socket for migrated flow"); goto err; } @@ -3375,7 +3376,7 @@ int tcp_flow_migrate_target(struct ctx *c, int fd) conn->seq_init_from_tap = ntohl(t.seq_init_from_tap); if ((rc = tcp_flow_repair_socket(c, conn))) { - flow_err(flow, "Can't set up socket: %s, drop", strerror_(rc)); + flow_err(flow, "Can't set up socket: %s, drop", strerror_(-rc)); flow_alloc_cancel(flow); return 0; } -- 2.48.1
tcp_rst() attempts to send an RST packet to the guest, and if that succeeds moves the flow to CLOSED state. However, even if the tcp_send_flag() fails the flow is still dead: we've usually closed the socket already, and something has already gone irretrievably wrong. So we should still mark the flow as CLOSED. That will cause it to be cleaned up, meaning any future packets from the guest for it won't match a flow, so should generate new RSTs (they don't at the moment, but that's a separate bug). Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- tcp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tcp.c b/tcp.c index 8528ee35..d23b6d94 100644 --- a/tcp.c +++ b/tcp.c @@ -1214,8 +1214,8 @@ void tcp_rst_do(const struct ctx *c, struct tcp_tap_conn *conn) if (conn->events == CLOSED) return; - if (!tcp_send_flag(c, conn, RST)) - conn_event(c, conn, CLOSED); + tcp_send_flag(c, conn, RST); + conn_event(c, conn, CLOSED); } /** -- 2.48.1
In tcp_flow_migrate_target(), if we're unable to create and bind the new socket, we print an error, cancel the flow and carry on. This seems to make sense based on our policy of generally letting the migration complete even if some or all flows are lost in the process. But it doesn't quite work: the flow_alloc_cancel() means that the flows in the target's flow table are no longer one to one match to the flows which the source is sending data for. This means that data for later flows will be mismatched to a different flow. Most likely that will cause some nasty error later, but even worse it might appear to succeed but lead to data corruption due to incorrectly restoring one of the flows. Instead, we should leave the flow in the table until we've read all the data for it, *then* discard it. Technically removing the flow_alloc_cancel() would be enough for this: if tcp_flow_repair_socket() fails it leaves conn->sock == -1, which will cause the restore functions in tcp_flow_migrate_target_ext() to fail, discarding the flow. To make what's going on clearer (and with less extraneous error messages), put several explicit tests for a missing socket later in the migration path to read the data associated with the flow but explicitly discard it. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- tcp.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/tcp.c b/tcp.c index d23b6d94..b3aa9a2c 100644 --- a/tcp.c +++ b/tcp.c @@ -2708,6 +2708,9 @@ int tcp_flow_repair_on(struct ctx *c, const struct tcp_tap_conn *conn) { int rc = 0; + if (conn->sock < 0) + return 0; + if ((rc = repair_set(c, conn->sock, TCP_REPAIR_ON))) err("Failed to set TCP_REPAIR"); @@ -2725,6 +2728,9 @@ int tcp_flow_repair_off(struct ctx *c, const struct tcp_tap_conn *conn) { int rc = 0; + if (conn->sock < 0) + return 0; + if ((rc = repair_set(c, conn->sock, TCP_REPAIR_OFF))) err("Failed to clear TCP_REPAIR"); @@ -3377,7 +3383,8 @@ int tcp_flow_migrate_target(struct ctx *c, int fd) if ((rc = tcp_flow_repair_socket(c, conn))) { flow_err(flow, "Can't set up socket: %s, drop", strerror_(-rc)); - flow_alloc_cancel(flow); + /* Can't leave the flow in an incomplete state */ + FLOW_ACTIVATE(conn); return 0; } @@ -3453,6 +3460,10 @@ int tcp_flow_migrate_target_ext(struct ctx *c, struct tcp_tap_conn *conn, int fd return rc; } + if (conn->sock < 0) + /* We weren't able to create the socket, discard flow */ + goto fail; + if (tcp_flow_select_queue(s, TCP_SEND_QUEUE)) goto fail; @@ -3540,8 +3551,10 @@ int tcp_flow_migrate_target_ext(struct ctx *c, struct tcp_tap_conn *conn, int fd return 0; fail: - tcp_flow_repair_off(c, conn); - repair_flush(c); + if (conn->sock >= 0) { + tcp_flow_repair_off(c, conn); + repair_flush(c); + } conn->flags = 0; /* Not waiting for ACK, don't schedule timer */ tcp_rst(c, conn); -- 2.48.1
On Thu, 27 Feb 2025 16:55:12 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:From Red Hat internal testing we've had some reports that if attempting to migrate without passt-repair, the failure mode is uglier than we'd like. The migration fails, which is somewhat expected, but we don't correctly roll things back on the source, so it breaks network there as well. Address this and several other fragilities in the migration. Everything tested with the basic test suite, plus some additional testing for the specific functionality of the patches: For patches 1..2: * I get a clean migration if there are now active flows * Migration completes, although connections are broken if passt-repair isn't connected For patches 3..5: * Migration completes ok if the source and destination hosts have different IPs * Target correctly sees the bind() failure and abandons the flow * Unfortunately, target-side client doesn't get a reset, it just sits there not working. This is because a) the RST we try to send is lost because the queue is still inactive over the migration and b) we don't send RSTs or ICMPs for packets from the guest which no longer match a flow (I hope to tackle this soon) * After manually quitting the stuck client on the target, other connectivity worksApplied. I didn't manage to try out the setup you proposed yet, but I tried things out in my partial flow (with migration succeeding but connections closing right away) and everything looks fine there. -- Stefano