We, like libsoccr, handle restoring the send queue in two pieces.  The
"already sent" piece is written in repair mode, to repopulate the kernel
sndbuf without actually sending new packets to the peer.  The "not sent"
piece is written out of repair mode, so that we both put it into sndbuf
*and* actually send it out.

To do that we temporarily drop out of repair mode.  However, we do so
before we've set TCP_REPAIR_WINDOW, meaning we're doing real send()s on
a real, non-repair socket that has bad window information.  That seems
bad.

Despite it differing from libsoccr, move the sending of the not-sent
queued data until after the *final* repair off.  I strongly suspect that
both we and libsoccr were only (kind of) getting away with this because
notsent is usually 0.

This seems to fix an intermittent hang I was seeing on
migrate/iperf3_bidir6.  I was seeing that perhaps 1 time in 3, or 1 time
in 5 with DEBUG=1.  I did observe a nonzero notsent the one time I
reproduced after I knew what I was looking for.

Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au>
---
 tcp.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/tcp.c b/tcp.c
index f18b2913..2a64b7c5 100644
--- a/tcp.c
+++ b/tcp.c
@@ -3530,25 +3530,22 @@ int tcp_flow_migrate_target_ext(struct ctx *c, union flow *flow, int fd)
 		shutdown(s, SHUT_WR);
 	}
 
+	if ((rc = tcp_flow_repair_wnd(s, &t)))
+		return rc;
+
+	tcp_flow_repair_off(c, conn);
+	repair_flush(c);
+
 	if (t.notsent) {
-		tcp_flow_repair_off(c, conn);
-		repair_flush(c);
+		err("socket %i, t.sndq=%u t.notsent=%u",
+		    s, t.sndq, t.notsent);
 
 		if ((rc = tcp_flow_repair_queue(s, t.notsent,
 						tcp_migrate_snd_queue +
 						(t.sndq - t.notsent))))
 			return rc;
-
-		tcp_flow_repair_on(c, conn);
-		repair_flush(c);
 	}
 
-	if ((rc = tcp_flow_repair_wnd(s, &t)))
-		return rc;
-
-	tcp_flow_repair_off(c, conn);
-	repair_flush(c);
-
 	/* If we sent a FIN but it wasn't acknowledged yet (TCP_FIN_WAIT1), send
 	 * it out, because we don't know if we already sent it.
 	 *
-- 
2.48.1