[PATCH 0/8] Fix a number of bugs with handling of TCP packets from tap
In the course of investigating bug 68, I discovered a number of pretty serious bugs in how we handle various cases in tcp_tap_handler() and tcp_data_from_tap(). This series fixes a number of them. Note that while I'm pretty sure the bugs fixed here are real, I haven't yet positively traced how they lead to the symptoms in bug 68 - I'm still waiting on the results from some special instrumentation to track that down. Link: https://bugs.passt.top/show_bug.cgi?id=68 David Gibson (8): tcp, tap: Correctly advance through packets in tcp_tap_handler() udp, tap: Correctly advance through packets in udp_tap_handler() tcp: Remove some redundant packet_get() operations tcp: Never hash match closed connections tcp: Return consumed packet count from tcp_data_from_tap() tcp: Correctly handle RST followed rapidly by SYN tcp: Consolidate paths where we initiate reset on tap interface tcp: Correct handling of FIN,ACK followed by SYN tap.c | 29 ++++++++++-------- tcp.c | 98 +++++++++++++++++++++++++++++++---------------------------- tcp.h | 2 +- udp.c | 15 ++++----- udp.h | 2 +- 5 files changed, 78 insertions(+), 68 deletions(-) -- 2.41.0
In both tap4_handler() and tap6_handler(), once we've sorted incoming l3
packets into "sequences", we then step through all the packets in each TCP
sequence calling tcp_tap_handler(). Or so it appears.
In fact, tcp_tap_handler() doesn't take an index and always looks at packet
0 of the sequence, except when it calls tcp_data_from_tap() to process
data packets. It appears to be written with the idea that the struct pool
is a queue, from which it consumes packets as it processes them, but that's
not how the pool data structure works - they are more like an array of
packets.
We only get away with this, because setup packets for TCP tend to come in
separate batches (because we need to reply in between) and so we only get
a bunch of packets for the same connection together when they're data
packets (tcp_data_from_tap() has its own loop through packets).
Correct this by adding an index parameter to tcp_tap_handler() and altering
the loops in tap.c to step through the pool properly.
Link: https://bugs.passt.top/show_bug.cgi?id=68
Signed-off-by: David Gibson
In both tap4_handler() and tap6_handler(), once we've sorted incoming l3
packets into "sequences", we then step through all the packets in each DUP
sequence calling udp_tap_handler(). Or so it appears.
In fact, udp_tap_handler() doesn't take an index and always starts with
packet 0 of the sequence, even if called repeatedly. It appears to be
written with the idea that the struct pool is a queue, from which it
consumes packets as it processes them, but that's not how the pool data
structure works.
Correct this by adding an index parameter to udp_tap_handler() and altering
the loops in tap.c to step through the pool properly.
Signed-off-by: David Gibson
Both tcp_data_from_tap() and tcp_tap_handler() call packet_get() to get
the entire L4 packet length, then immediately call it again to check that
the packet is long enough to include a TCP header. The features of
packet_get() let us easily combine these together, we just need to adjust
the length slightly, because we want the value to include the TCP header
length.
Signed-off-by: David Gibson
From a practical point of view, when a TCP connection ends, whether by
FIN or by RST, we set the CLOSED event, then some time later we remove the
connection from the hash table and clean it up. However, from a protocol
point of view, once it's closed, it's gone, and any new packets with
matching addresses and ports are either forming a new connection, or are
invalid packets to discard.
Enforce these semantics in the TCP hash logic by never hash matching closed
connections.
Signed-off-by: David Gibson
Currently tcp_data_from_tap() is assumed to consume all packets remaining
in the packet pool it is given. However there are some edge cases where
that's not correct. In preparation for fixing those, change it to return
a count of packets consumed and use that in its caller.
Signed-off-by: David Gibson
Although it's unlikely in practice, the guest could theoretically
reset one TCP connection then immediately start a new one with the
same addressses and ports, such that we get an RST then a SYN in the
same batch of received packets in tcp_tap_handler().
We don't correctly handle that unlikely case, because when we receive
the RST, we discard any remaining packets in the batch so we'd never
see the SYN. This could happen in either tcp_tap_handler() or
tcp_data_from_tap(). Correct that by returning 1, so that the caller
will continue calling tcp_tap_handler() on subsequent packets allowing
us to process any subsequent SYN.
Signed-off-by: David Gibson
There are a number of conditions where we will issue a TCP RST in response
to something unexpected we received from the tap interface. These occur in
both tcp_data_from_tap() and tcp_tap_handler(). In tcp_tap_handler() use
a 'goto out of line' technique to consolidate all these paths into one
place. For the tcp_data_from_tap() cases use a negative return code and
direct that to the same path in tcp_tap_handler(), its caller.
In this case we want to discard all remaining packets in the batch we have
received: even if they're otherwise good, they'll be invalidated when the
guest receives the RST we're sending. This is subtly different from the
case where we *receive* an RST, where we could in theory get a new SYN
immediately afterwards. Clarify that with a common on the now common
reset path.
Signed-off-by: David Gibson
When the guest tries to establish a connection, it could give up on it by
sending a FIN,ACK instead of a plain ACK to our SYN,ACK. It could then
make a new attempt to establish a connection with the same addresses and
ports with a new SYN.
Although it's unlikely, it could send the 2nd SYN very shortly after the
FIN,ACK resulting in both being received in the same batch of packets from
the tap interface.
Currently, we don't handle that correctly, when we receive a FIN,ACK on a
not fully established connection we discard the remaining packets in the
batch, and so will never process the 2nd SYN. Correct this by returning
1 from tcp_tap_handler() in this case, so we'll just consume the FIN,ACK
and continue to process the rest of the batch.
Signed-off-by: David Gibson
On Fri, 8 Sep 2023 11:49:45 +1000
David Gibson
In the course of investigating bug 68, I discovered a number of pretty serious bugs in how we handle various cases in tcp_tap_handler() and tcp_data_from_tap(). This series fixes a number of them.
Note that while I'm pretty sure the bugs fixed here are real, I haven't yet positively traced how they lead to the symptoms in bug 68 - I'm still waiting on the results from some special instrumentation to track that down.
Link: https://bugs.passt.top/show_bug.cgi?id=68
David Gibson (8): tcp, tap: Correctly advance through packets in tcp_tap_handler() udp, tap: Correctly advance through packets in udp_tap_handler() tcp: Remove some redundant packet_get() operations tcp: Never hash match closed connections tcp: Return consumed packet count from tcp_data_from_tap() tcp: Correctly handle RST followed rapidly by SYN tcp: Consolidate paths where we initiate reset on tap interface tcp: Correct handling of FIN,ACK followed by SYN
Series applied, thanks. -- Stefano
participants (2)
-
David Gibson
-
Stefano Brivio