Re: [PATCH] tap: Drop frames if no client connected

25 Sep 2025

On Thu, 25 Sep 2025 13:08:35 +0800
Yumei Huang wrote:
...
On Wed, Sep 24, 2025 at 5:56 PM Stefano Brivio wrote:
...
On Wed, 24 Sep 2025 11:49:28 +1000
David Gibson wrote:
...
So... summarising.  As I see it, we have two main cases to consider:
the one where the guest comes online pretty soon, and the one where it
doesn't.  Here's what I think the behaviour would be for these two
cases with a variety of ways of handling it.  This is more-or-less
from the peer's perspective.
(0) Physically disconnected guest (bridged network, no passt involved)
(0a) Guest online never
        SYN ... SYN ... SYN ... <peer times out>
(0b) Guest online soonish
        SYN ... SYN ... SYN-ACK, ACK <working connection>
(1) Status quo
Passt doesn't resend SYNs, and will time out the connection after 10s.
(1a) Guest online never
        SYN, SYN-ACK, ACK ... ... ... ... <passt times out> RST
(1b) Guest online soonish
        SYN, SYN-ACK, ACK ... ... ... ... <passt times out> RST
(2) Yumei's patch
As (1), but without EBADFs
(3) passt resends SYNs
(3a) Guest online never
        SYN, SYN-ACK, ACK ... ... ... ... ... <passt times out> RST
(3b) Guest online soonish
        SYN, SYN-ACK, ACK ... ... ... ... <working connection>
(4) Passt resends SYNs + Yumei's patch
As (3), but without EBADFs
(5) passt explicitly resets when guest is not present
(5a) Guest online never
        SYN, SYN-ACK, ACK, RST
(5b) Guest online soonish
        SYN, SYN-ACK, ACK, RST
(6) Delayed listen()
(6a) Guest online never
        SYN, RST
(6b) Guest online soonish
        SYN, RST
(99) Bridged guest isn't listening (no passt)
(99a) Guest online never
        SYN, RST
(99b) Guest online soonish
        SYN, RST
=====
It all makes sense, thanks for summarising those.
...
So, if (99) is our model, we can match it pretty exactly with delayed
listen().  But if (0) is our model, the closest we can get is (3) or
(4), which I think will look fairly similar to the peer application,
even though it looks different to the peer TCP stack.
I think (0) is a better model, because it means we won't reset
connections if they happen to land when a still running guest has its
connection to passt temporarily interrupted.
Which brings me, I think, to the same conclusion you had: we should
resend SYNs.
Suggested next steps:
 - Apply Yumei's patch, it doesn't change behaviour and removes the
   odd EBADFs
 - Yumei investigates implementing SYN resends
Right, that also makes sense to me.
Glad we reached an agreement here. BTW, in case you missed it, the v2
patch was sent as
https://archives.passt.top/passt-dev/20250912081705.20796-1-yuhuang@redhat.c....
I never miss patches. :) No worries, I just got a few interruptions in
a row but I plan to apply it soon.
...
For the second part, we could probably reuse a mechanism similar to
what we do for re-transmits, and perhaps rename 'retrans' in struct
tcp_tap_conn to 'retries', so that we can use it for both (we're a bit
tight on space there).
My initial thought was to call tcp_send_flag() from tcp_flow_defer(),
but that doesn't seem to work. I'm still trying to figure out why.
That might work, although I guess the most natural alternative would
be to change the handling of an expired SYN_TIMEOUT in
tcp_timer_handler(). Look at this case:

	} else if (conn->flags & ACK_FROM_TAP_DUE) {
		if (!(conn->events & ESTABLISHED)) {
			flow_dbg(conn, "handshake timeout");

...it should become a bit more like this one:

		} else {
			flow_dbg(conn, "ACK timeout, retry");
			conn->retrans++;
			...

where we retry a few times before resetting the connection.

With timers, you already have timed triggers, as opposed to trying
things out periodically from tcp_flow_defer().
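
To make the idea a bit more concrete, here is a rough, standalone
sketch of the decision the timer handler would make. It is not passt
code: struct sketch_conn, sketch_timer_expired(), enum sketch_action
and SKETCH_RETRIES_MAX are all made up for illustration, and the retry
limit is an arbitrary assumption. The only point is that a handshake
timeout and an ACK timeout could share one 'retries' counter and both
retry a few times before resetting.

```c
#include <stdbool.h>

/* Hypothetical retry limit, picked for illustration only */
#define SKETCH_RETRIES_MAX	3

/* What the (hypothetical) timer handler decides to do on expiry */
enum sketch_action {
	SKETCH_NONE,		/* No ACK due from tap, nothing to do */
	SKETCH_RESEND_SYN,	/* Handshake not completed: send SYN again */
	SKETCH_RETRANSMIT,	/* Established: retransmit unacked data */
	SKETCH_RESET,		/* Too many retries: reset the connection */
};

/* Simplified stand-in for the connection state, not struct tcp_tap_conn */
struct sketch_conn {
	bool ack_from_tap_due;	/* Stand-in for the ACK_FROM_TAP_DUE flag */
	bool established;	/* Stand-in for the ESTABLISHED event */
	unsigned int retries;	/* One counter for SYN resends and retransmits */
};

/* Decide what to do when the timer for this connection expires */
static enum sketch_action sketch_timer_expired(struct sketch_conn *conn)
{
	if (!conn->ack_from_tap_due)
		return SKETCH_NONE;

	/* Same limit for both cases: give up and reset once it's reached */
	if (conn->retries >= SKETCH_RETRIES_MAX)
		return SKETCH_RESET;

	conn->retries++;

	return conn->established ? SKETCH_RETRANSMIT : SKETCH_RESEND_SYN;
}
```

With a connection that never completes the handshake, this gives three
SYN resends followed by a reset, which is roughly case (3a) above. If
the guest comes online in between, the handshake completes and we end
up in (3b) instead.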

-- 
Stefano