Flow table entries need to be fully initialised before returning to the main epoll loop. Commit 0060acd1 ("flow: Clarify and enforce flow state transitions") now enforces that: once a flow is allocated we must either cancel it, or activate it before returning to the main loop, or we will hit an ASSERT(). Some error paths in tcp_conn_from_tap() weren't correctly updated for this requirement - we can exit with a flow entry incompletely initialised. Correct that by cancelling the flows in those situations. I don't have enough information to be certain if this is the cause for podman bug 22925, but it plausibly could be. Fixes: 0060acd1 ("flow: Clarify and enforce flow state transitions") Link: https://github.com/containers/podman/issues/22925 Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- tcp.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tcp.c b/tcp.c index 0dccde9c..cb613cf3 100644 --- a/tcp.c +++ b/tcp.c @@ -2067,7 +2067,7 @@ static void tcp_conn_from_tap(struct ctx *c, sa_family_t af, if (!bind(s, sa, sl)) { tcp_rst(c, conn); /* Nobody is listening then */ - return; + goto cancel; } if (errno != EADDRNOTAVAIL && errno != EACCES) conn_flag(c, conn, LOCAL); @@ -2080,7 +2080,7 @@ static void tcp_conn_from_tap(struct ctx *c, sa_family_t af, if (connect(s, sa, sl)) { if (errno != EINPROGRESS) { tcp_rst(c, conn); - return; + goto cancel; } tcp_get_sndbuf(conn); @@ -2088,7 +2088,7 @@ static void tcp_conn_from_tap(struct ctx *c, sa_family_t af, tcp_get_sndbuf(conn); if (tcp_send_flag(c, conn, SYN | ACK)) - return; + goto cancel; conn_event(c, conn, TAP_SYN_ACK_SENT); } -- 2.45.2
On Fri, 7 Jun 2024 11:55:24 +1000 David Gibson <david(a)gibson.dropbear.id.au> wrote:Flow table entries need to be fully initialised before returning to the main epoll loop. Commit 0060acd1 ("flow: Clarify and enforce flow state transitions") now enforces that: once a flow is allocated we must either cancel it, or activate it before returning to the main loop, or we will hit an ASSERT(). Some error paths in tcp_conn_from_tap() weren't correctly updated for this requirement - we can exit with a flow entry incompletely initialised. Correct that by cancelling the flows in those situations. I don't have enough information to be certain if this is the cause for podman bug 22925, but it plausibly could be. Fixes: 0060acd1 ("flow: Clarify and enforce flow state transitions") Link: https://github.com/containers/podman/issues/22925 Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au>Applied. -- Stefano