[PATCH v2 0/7] Flow Table Preliminaries
I'm still working on bunch of things to start implementing the generalised flow table. However, I think this set of preliminary clean ups and fixes stand well enough on their own that they're ready for merge now. Based on the epoll patch series. Changes since v1: * Add missing patch moving in_epoll flag David Gibson (7): tap: Don't clobber source address in tap6_handler() tap: Pass source address to protocol handler functions tcp: More precise terms for addresses and ports tcp: Consistent usage of ports in tcp_seq_init() tcp, udp: Don't include destination address in partially precomputed csums tcp, udp: Don't pre-fill IPv4 destination address in headers tcp: Move in_epoll flag out of common connection structure icmp.c | 12 ++-- icmp.h | 3 +- passt.c | 10 ++- passt.h | 4 +- pasta.c | 2 +- tap.c | 29 ++++---- tcp.c | 194 +++++++++++++++++++++++---------------------------- tcp.h | 5 +- tcp_conn.h | 18 ++--- tcp_splice.c | 4 +- udp.c | 37 ++++------ udp.h | 5 +- util.h | 4 +- 13 files changed, 151 insertions(+), 176 deletions(-) -- 2.41.0
In tap6_handler() saddr is initialized to the IPv6 source address from the
incoming packet. However part way through, but before organizing the
packet into a "sequence" we set it unconditionally to the guest's assigned
address. We don't do anything equivalent for IPv4.
This doesn't make a lot of sense: if the guest is using a different source
address it makes sense to consider these different sequences of packets and
we shouldn't try to combine them together.
Signed-off-by: David Gibson
The tap code passes the IPv4 or IPv6 destination address of packets it
receives to the protocol specific code. Currently that protocol code
doesn't use the source address, but we want it to in future. So, in
preparation, pass the IPv4/IPv6 source address of tap packets to those
functions as well.
Signed-off-by: David Gibson
In a number of places the comments and variable names we use to describe
addresses and ports are ambiguous. It's not sufficient to describe a port
as "tap-facing" or "socket-facing", because on both the tap side and the
socket side there are two ports for the two ends of the connection.
Similarly, "local" and "remote" aren't particularly helpful, because it's
not necessarily clear whether we're talking from the point of view of the
guest/namespace, the host, or passt itself.
This patch makes a number of changes to be more precise about this. It
introduces two new terms in aid of this:
A "forwarding" address (or port) refers to an address which is local
from the point of view of passt itself. That is a source address for
traffic sent by passt, whether it's to the guest via the tap interface
or to a host on the internet via a socket.
The "endpoint" address (or port) is the reverse: a remote address
from passt's point of view, the destination address for traffic sent
by passt.
Between them the "side" (either tap/guest-facing or sock/host-facing)
and forwarding vs. endpoint unambiguously describes which address or
port we're talking about.
Signed-off-by: David Gibson
In tcp_seq_init() the meaning of "src" and "dst" isn't really clear since
it's used for connections in both directions. However, these values are
just feeding a hash, so as long as we're consistent and include all the
information we want, it doesn't really matter.
Oddly, for the "src" side we supply the (tap side) forwarding address but
the (tap side) endpoint port. This again doesn't really matter, but it's
confusing. So swap this with dstport, so "src" is always forwarding
and "dst" is always endpoint.
Signed-off-by: David Gibson
We partially prepopulate IP and TCP header structures including, amongst
other things the destination address, which for IPv4 is always the known
address of the guest/namespace. We partially precompute both the IPv4
header checksum and the TCP checksum based on this.
In future we're going to want more flexibility with controlling the
destination for IPv4 (as we already do for IPv6), so this precomputed value
gets in the way. Therefore remove the IPv4 destination from the
precomputed checksum and fold it into the checksum update when we actually
send a packet.
Doing this means we no longer need to recompute those partial sums when
the destination address changes ({tcp,udp}_update_l2_buf()) and instead
the computation can be moved to compile time. This means while we perform
slightly more computations on each packet, we slightly reduce the amount of
memory we need to access.
Signed-off-by: David Gibson
Because packets sent on the tap interface will always be going to the
guest/namespace, we more-or-less know what address they'll be going to. So
we pre-fill this destination address in our header buffers for IPv4. We
can't do the same for IPv6 because we could need either the global or
link-local address for the guest. In future we're going to want more
flexibility for the destination address, so this pre-filling will get in
the way.
Change the flow so we always fill in the IPv4 destination address for each
packet, rather than prefilling it from proto_update_l2_buf(). In fact for
TCP we already redundantly filled the destination for each packet anyway.
Signed-off-by: David Gibson
The in_epoll boolean is one of only two fields (currently) in the common
structure shared between tap and spliced connections. It seems like it
belongs there, because both tap and spliced connections use it, and it has
roughly the same meaning.
Roughly, however, isn't exactly: which fds this flag says are in the epoll
varies between the two connection types, and are in type specific fields.
So, it's only possible to meaningfully use this value locally in type
specific code anyway.
This common field is going to get in the way of more widespread
generalisation of connection / flow tracking, so move it to separate fields
in the tap and splice specific structures.
Signed-off-by: David Gibson
On Fri, Aug 11, 2023 at 10:53:58PM +1000, David Gibson wrote:
I'm still working on bunch of things to start implementing the generalised flow table. However, I think this set of preliminary clean ups and fixes stand well enough on their own that they're ready for merge now.
Based on the epoll patch series.
JSYK, I will be sending another spin of this, with at least 2 additional patches. The only changes I'm anticipating in the existing patches is a trivial formatting tweak, so it shouldn't invalidate review on this version.
Changes since v1: * Add missing patch moving in_epoll flag
David Gibson (7): tap: Don't clobber source address in tap6_handler() tap: Pass source address to protocol handler functions tcp: More precise terms for addresses and ports tcp: Consistent usage of ports in tcp_seq_init() tcp, udp: Don't include destination address in partially precomputed csums tcp, udp: Don't pre-fill IPv4 destination address in headers tcp: Move in_epoll flag out of common connection structure
icmp.c | 12 ++-- icmp.h | 3 +- passt.c | 10 ++- passt.h | 4 +- pasta.c | 2 +- tap.c | 29 ++++---- tcp.c | 194 +++++++++++++++++++++++---------------------------- tcp.h | 5 +- tcp_conn.h | 18 ++--- tcp_splice.c | 4 +- udp.c | 37 ++++------ udp.h | 5 +- util.h | 4 +- 13 files changed, 151 insertions(+), 176 deletions(-)
-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
participants (1)
-
David Gibson