On Thu, 18 Jan 2024 12:15:12 +1100
David Gibson <david(a)gibson.dropbear.id.au> wrote:
On Wed, Jan 17, 2024 at 08:59:34PM +0100, Stefano
Brivio wrote:
On Thu, 21 Dec 2023 18:02:28 +1100
David Gibson <david(a)gibson.dropbear.id.au> wrote:
Currently we match TCP packets received on the
tap connection to a TCP
connection via a hash table based on the forwarding address and both
ports. We hope in future to allow for multiple guest side addresses, or
for multiple interfaces which means we may need to distinguish based on
the endpoint address and pif as well. We also want a unified hash table
to cover multiple protocols, not just TCP.
Replace the TCP specific hash function with one suitable for general flows,
or rather for one side of a general flow. This includes all the
information from struct flowside, plus the L4 protocol number.
Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au>
---
flow.c | 21 +++++++++++++++++++++
flow.h | 19 +++++++++++++++++++
tcp.c | 59 +++++++++++-----------------------------------------------
3 files changed, 51 insertions(+), 48 deletions(-)
diff --git a/flow.c b/flow.c
index bc8cfc6..263633e 100644
--- a/flow.c
+++ b/flow.c
@@ -229,6 +229,27 @@ void flow_alloc_cancel(union flow *flow)
flow_first_free = FLOW_IDX(flow);
}
+/**
+ * flow_hash() - Calculate hash value for one side of a flow
+ * @c: Execution context
+ * @proto: Protocol of this flow (IP L4 protocol number)
+ * @fside: Flowside
+ *
+ * Return: hash value
+ */
+uint64_t flow_hash(const struct ctx *c, uint8_t proto,
+ const struct flowside *fside)
+{
+ struct siphash_state state = SIPHASH_INIT(c->hash_secret);
+
+ ASSERT(flowside_complete(fside));
+ inany_siphash_feed(&state, &fside->faddr);
+ inany_siphash_feed(&state, &fside->eaddr);
Customary newline here.
Done.
+ return
siphash_final(&state, 38, (uint64_t)proto << 40 |
+ (uint64_t)fside->pif << 32 |
+ fside->fport << 16 | fside->eport);
If we add the fields from the 'tail' part (not the whole fside) to an
anonymous struct in a similar way to what we had before fc8f0f8c48ef
("siphash: Use incremental rather than all-at-once siphash functions"),
then we could drop those shift and masks, and use sizeof(that) +
sizeof(fside->faddr) + sizeof(fside->eaddr) instead of '38'.
Hrm, I guess so. There are some wrinkles:
* we'd need a union so we can get the actual u64 value we need to
pass
I haven't tried, but I guess you can just cast a (packed) struct.
* we'd need to make sure that the the remaining
(64-38 == 26) bytes
of that union are consistently initialised
Where do the 64 _bytes_ come from?
* the struct part would need to be packed, or
padding will mess with
us
Right, yes, just see tcp_seq_init() before fc8f0f8c48ef.
* the exact value would now depend on the host
endianness, which is
probably fine, but worth noting
Oh, I didn't even think of that when I wrote the old tcp_seq_init().
Anyway, yes, it doesn't matter.
+}
+
/**
* flow_defer_handler() - Handler for per-flow deferred and timed tasks
* @c: Execution context
diff --git a/flow.h b/flow.h
index e7c4484..72ded54 100644
--- a/flow.h
+++ b/flow.h
@@ -81,6 +81,22 @@ static inline bool flowside_complete(const struct flowside *fside)
#define SIDES 2
+/**
+ * flowside_eq() - Check if two flowsides are equal
...this raises the question: if the protocol number is now used in the
hash, shouldn't it eventually become part of struct flowside -- and
compared here, too?
I guess it's useful iff we allow flowsides
for the same flow to have
different protocol numbers.
Right, which is not something I'm planning on doing, or at least not
very soon.
Now, forwarding between TCP and UDP endpoints
might not make a lot of
sense, because we would have to make so many arbitrary assumptions as
to make it probably not generic enough to be useful.
Exactly.
But TCP to stream-oriented UNIX domain socket
makes sense, and we also
had user requests in that sense. Oh and... what would be the second
protocol number in that case?
Right, that's a possible future extension. But if we're going outside
the realm of IP, a number of things need to be changed. I think
that's something for another day.
I'm not suggesting to support that right away, I was just wondering if
it actually makes sense, right from the beginning, to keep the hash
information "consistent" with the flow table information, including
having the protocol number in the flow table.
But I didn't really think it through, you know better.
--
Stefano