[PATCH 0/8] RFC: Generalize flow tracking, part 1
This is a first draft of the first part of implementing a more general
flow table (connection tracking) as described at:
    https://pad.passt.top/p/NewForwardingModel

This is by no means complete.  So far it doesn't really do anything
new, it just reorganizes the TCP connection table to be closer to the
more general flow table.  Still, it's ready for preliminary review.

David Gibson (8):
  tap: Don't clobber source address in tap6_handler()
  tap: Pass source address to protocol handler functions
  tcp: More precise terms for addresses and ports
  tcp, udp: Don't include destination address in partially precomputed
    csums
  tcp, udp: Don't pre-fill IPv4 destination address in headers
  tcp: Track guest-side correspondent address
  tcp, flow: Introduce struct demiflow
  tcp, flow: Perform TCP hash calculations based on demiflow structure

 flow.h       |  66 +++++++++++++++
 icmp.c       |  12 ++-
 icmp.h       |   3 +-
 passt.c      |  10 +--
 passt.h      |   4 +-
 pasta.c      |   2 +-
 siphash.c    |   1 +
 tap.c        |  29 ++++---
 tcp.c        | 227 ++++++++++++++++++++-------------------------------
 tcp.h        |   5 +-
 tcp_conn.h   |   9 +-
 tcp_splice.c |   2 +
 udp.c        |  37 +++------
 udp.h        |   5 +-
 util.h       |   4 +-
 15 files changed, 209 insertions(+), 207 deletions(-)
 create mode 100644 flow.h

-- 
2.41.0

[PATCH 1/8] tap: Don't clobber source address in tap6_handler()
In tap6_handler() saddr is initialized to the IPv6 source address from the
incoming packet. However, part way through, but before organizing the
packet into a "sequence", we set it unconditionally to the guest's assigned
address. We don't do anything equivalent for IPv4.
This doesn't make a lot of sense: if the guest is using a different source
address it makes sense to consider these different sequences of packets and
we shouldn't try to combine them together.
Signed-off-by: David Gibson

[PATCH 2/8] tap: Pass source address to protocol handler functions
The tap code passes the IPv4 or IPv6 destination address of packets it
receives to the protocol-specific code. Currently that protocol code
doesn't use the source address, but we want it to in future. So, in
preparation, pass the IPv4/IPv6 source address of tap packets to those
functions as well.
Signed-off-by: David Gibson

[PATCH 3/8] tcp: More precise terms for addresses and ports
In a number of places the comments and variable names we use to describe
addresses and ports are ambiguous. It's not sufficient to describe a port
as "tap-facing" or "socket-facing", because on both the tap side and the
socket side there are two ports for the two ends of the connection.
Similarly, "local" and "remote" aren't particularly helpful, because it's
not necessarily clear whether we're talking from the point of view of the
guest/namespace, the host, or passt itself.
This patch makes a number of changes to be more precise about this. It
introduces two new terms in aid of this:
  * A "forwarding" address (or port) refers to an address which is local
    from the point of view of passt itself. That is a source address for
    traffic sent by passt, whether it's to the guest via the tap interface
    or to a host on the internet via a socket.
  * The "correspondent" address (or port) is the reverse: a remote address
    from passt's point of view, the destination address for traffic sent by
    passt.
Between them, the "side" (either tap/guest-facing or sock/host-facing) and
forwarding/correspondent unambiguously describe which address or port
we're talking about.
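As a rough illustration of the terminology only, the sketch below indexes a port by side and by role. All names here (enum side, enum role, struct port_table, tx_src_port(), tx_dst_port()) are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only; not from the patch */
enum side { SIDE_TAP = 0, SIDE_SOCK = 1 };	/* guest-facing, host-facing */
enum role { ROLE_FORWARDING = 0, ROLE_CORRESPONDENT = 1 };

/* One port for every (side, role) pair of a single connection */
struct port_table {
	uint16_t port[2][2];	/* indexed as [side][role] */
};

/* Source port of traffic passt sends on side @s: the forwarding port */
static inline uint16_t tx_src_port(const struct port_table *t, enum side s)
{
	return t->port[s][ROLE_FORWARDING];
}

/* Destination port of traffic passt sends on side @s: the correspondent */
static inline uint16_t tx_dst_port(const struct port_table *t, enum side s)
{
	return t->port[s][ROLE_CORRESPONDENT];
}
```

For example, if a guest connects from port 40000 to a web server on port 80, the tap-side forwarding port would be 80 and the tap-side correspondent port 40000, while on the socket side the forwarding port is whatever local port the host socket uses and the correspondent port is 80.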
Signed-off-by: David Gibson

[PATCH 4/8] tcp, udp: Don't include destination address in partially precomputed csums
We partially prepopulate IP and TCP header structures including, amongst
other things, the destination address, which for IPv4 is always the known
address of the guest/namespace. We partially precompute both the IPv4
header checksum and the TCP checksum based on this.
In future we're going to want more flexibility with controlling the
destination for IPv4 (as we already do for IPv6), so this precomputed value
gets in the way. Therefore remove the IPv4 destination from the
precomputed checksum and fold it into the checksum update when we actually
send a packet.
Doing this means we no longer need to recompute those partial sums when
the destination address changes ({tcp,udp}_update_l2_buf()) and instead
the computation can be moved to compile time. This means while we perform
slightly more computations on each packet, we slightly reduce the amount of
memory we need to access.
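The reason this works is that the Internet checksum is a one's complement sum, so words can be added in any order. The following is a minimal sketch of the idea for an IPv4 header, not the code from the patch: the function names are made up, and for simplicity the partial sum covers every header word except the destination address, whereas the real code also has to deal with other per-packet fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Add big-endian 16-bit words of @buf to the running sum @sum */
static uint32_t sum16(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)			/* odd trailing byte, zero padded */
		sum += (uint32_t)buf[i] << 8;

	return sum;
}

/* Fold carries back into 16 bits and take the one's complement */
static uint16_t csum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}

/* Reference: checksum over the whole 20-byte header (csum field zeroed) */
uint16_t ipv4_csum_full(const uint8_t *hdr)
{
	return csum_fold(sum16(hdr, 20, 0));
}

/* Partial sum over bytes 0..15, i.e. everything except the destination */
uint32_t ipv4_csum_partial(const uint8_t *hdr)
{
	return sum16(hdr, 16, 0);
}

/* At send time, fold the 4-byte destination address into the partial sum */
uint16_t ipv4_csum_finish(uint32_t partial, const uint8_t *daddr)
{
	return csum_fold(sum16(daddr, 4, partial));
}
```

Finishing a partial sum with the destination address gives the same result as summing the whole header in one pass, so the partial value never needs recomputing when the destination changes.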
Signed-off-by: David Gibson

[PATCH 5/8] tcp, udp: Don't pre-fill IPv4 destination address in headers
Because packets sent on the tap interface will always be going to the
guest/namespace, we more-or-less know what address they'll be going to. So
we pre-fill this destination address in our header buffers for IPv4. We
can't do the same for IPv6 because we could need either the global or
link-local address for the guest. In future we're going to want more
flexibility for the destination address, so this pre-filling will get in
the way.
Change the flow so we always fill in the IPv4 destination address for each
packet, rather than prefilling it from proto_update_l2_buf(). In fact for
TCP we already redundantly filled the destination for each packet anyway.
Signed-off-by: David Gibson

[PATCH 6/8] tcp: Track guest-side correspondent address
Currently the only address we explicitly track in the TCP connection
structure is the tap side forwarding address - that is the remote address
from the guest's point of view. The tap side correspondent address - the
local address from the guest's point of view - is assumed to always be one
of the handful of guest addresses we track as addr_seen (one each for IPv4,
IPv6 global and IPv6 link-local).
We want to generalize our forwarding model to allow the guest to have
multiple addresses. As a start on this, track the tap-side correspondent
address in the connection structure, only using one of the addr_seen
variables when we start a new connection.
Signed-off-by: David Gibson

[PATCH 7/8] tcp, flow: Introduce struct demiflow
For TCP tap connections we keep track of both the IP address and port for
each side of a connection as seen by the guest. We're planning to track
similar information in a number of other places as well.
To assist with this, create a new structure, struct demiflow, to track both
sides of a connection or other logical packet flow as seen from a single
"side" of passt. Also add a small helper function for initializing this
structure.
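To give a feel for the shape of such a structure, here is a minimal sketch. It is not the definition from flow.h: the name demiflow_sketch, the field names, and the choice to store every address as 16 bytes (with IPv4 assumed to be stored as a v4-mapped IPv6 address) are all assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical layout: both endpoints of a flow as seen from one side of
 * passt.  Addresses are 16 raw bytes; storing IPv4 as v4-mapped IPv6 is an
 * assumption of this sketch.
 */
struct demiflow_sketch {
	uint8_t faddr[16];	/* forwarding address (local to passt) */
	uint8_t caddr[16];	/* correspondent address (remote to passt) */
	uint16_t fport;		/* forwarding port */
	uint16_t cport;		/* correspondent port */
};

/* Hypothetical helper: fill in all four fields in one call */
static inline void demiflow_sketch_init(struct demiflow_sketch *df,
					const uint8_t *faddr, uint16_t fport,
					const uint8_t *caddr, uint16_t cport)
{
	/* Zero first so any padding has a deterministic value */
	memset(df, 0, sizeof(*df));
	memcpy(df->faddr, faddr, sizeof(df->faddr));
	memcpy(df->caddr, caddr, sizeof(df->caddr));
	df->fport = fport;
	df->cport = cport;
}
```

Keeping the two address/port pairs together in one object means other protocols or sides could reuse the same type instead of carrying four loose parameters around.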
Signed-off-by: David Gibson

[PATCH 8/8] tcp, flow: Perform TCP hash calculations based on demiflow structure
Currently we match TCP packets received on the tap connection to a TCP
connection via a hash table based on the forwarding address and both ports.
We hope in future to allow for multiple guest side addresses, which means
we may need to distinguish based on the correspondent address as well.
Extend the hash function to include this information. Since this now
exactly corresponds to the contents of the guest-side demiflow, we can base
our hash functions on that, rather than a group of individual parameters.
We also put some of the helpers in flow.h, because we hope to be able to
re-use the hashing logic for other cases in future as well.
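As an illustration of hashing over both addresses and both ports, the sketch below feeds all four fields of a hypothetical demiflow-like structure into one hash. FNV-1a is used purely as a simple stand-in: siphash.c appears in the diffstat, so the real code presumably uses a keyed SipHash, which this sketch does not attempt to reproduce. The structure and function names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, see the assumptions in the text above */
struct demiflow_sketch {
	uint8_t faddr[16];	/* forwarding address */
	uint8_t caddr[16];	/* correspondent address */
	uint16_t fport;		/* forwarding port */
	uint16_t cport;		/* correspondent port */
};

/* One FNV-1a pass over @len bytes, continuing from state @h */
static uint64_t fnv1a(uint64_t h, const void *data, size_t len)
{
	const uint8_t *p = data;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;		/* 64-bit FNV prime */
	}

	return h;
}

/* Hash every field, so flows differing only in caddr hash differently */
uint64_t demiflow_sketch_hash(const struct demiflow_sketch *df)
{
	uint64_t h = 0xcbf29ce484222325ULL;	/* 64-bit FNV offset basis */

	h = fnv1a(h, df->faddr, sizeof(df->faddr));
	h = fnv1a(h, df->caddr, sizeof(df->caddr));
	h = fnv1a(h, &df->fport, sizeof(df->fport));
	h = fnv1a(h, &df->cport, sizeof(df->cport));

	return h;
}

/* Map the hash onto a table of @nbuckets entries (must be non-zero) */
size_t demiflow_sketch_bucket(const struct demiflow_sketch *df,
			      size_t nbuckets)
{
	return (size_t)(demiflow_sketch_hash(df) % nbuckets);
}
```

Hashing field by field, rather than over the raw bytes of the structure, avoids depending on padding. Unlike a keyed hash, FNV-1a gives no protection against deliberately constructed collisions, so it should be read only as a placeholder.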
Signed-off-by: David Gibson