On Tue, Aug 05, 2025 at 05:46:28PM +0200, Laurent Vivier wrote:
The packet pool was previously limited to handling packets contained within a single buffer.
This patch extends the packet pool to support iovec arrays, allowing a single logical packet to be composed of multiple iovecs.
To accommodate this, the storage format within the pool is modified. For a multi-vector packet, a header entry is now stored first with iov_base = NULL and iov_len holding the number of subsequent vectors. The actual data vectors are then stored in the following pool slots.
The packet_add_do() and packet_get_do() functions are updated to manage this new format for storing and retrieving packets. The pool_full() check is also adjusted to ensure there is enough space for all vectors of a new packet before adding it.
Signed-off-by: Laurent Vivier
---
 packet.c | 50 +++++++++++++++++++++++++++++++++-----------------
 packet.h |  2 +-
 tap.c    |  4 ++--
 3 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/packet.c b/packet.c
index 4b93688509a4..d697232d951a 100644
--- a/packet.c
+++ b/packet.c
@@ -90,12 +90,13 @@ static int packet_check_range(const struct pool *p, const char *ptr, size_t len,
 /**
  * pool_full() - Is a packet pool full?
  * @p:		Pointer to packet pool
+ * @data:	check data can fit in the pool
  *
- * Return: true if the pool is full, false if more packets can be added
+ * Return: true if the pool is full, false if data can be added
  */
-bool pool_full(const struct pool *p)
+bool pool_full(const struct pool *p, const struct iov_tail *data)
Given the slightly changed semantics, I wonder if 'pool_can_fit()' might be a better name now.
 {
-	return p->count >= p->size;
+	return p->count + data->cnt + (data->cnt > 1) >= p->size;
This test is only correct if data has already been pruned. As I've said elsewhere, it might be worth changing to the assumption that iov_tails are pruned everywhere outside the iov_tail internal handling.

Also, I think the new check is off by one (in the relatively safe direction): it will say there's no room when there is exactly enough room.
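To make the off-by-one concrete, here is a minimal standalone sketch. It assumes data is already pruned, and struct pool_sketch, slots_needed(), pool_full_posted() and pool_full_exact() are names made up for illustration; none of them are in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for struct pool (illustration only) */
struct pool_sketch {
	size_t size;	/* total number of slots */
	size_t count;	/* slots currently in use */
};

/* Slots used by a packet of @cnt vectors: one extra header slot if cnt > 1 */
static size_t slots_needed(size_t cnt)
{
	return cnt + (cnt > 1);
}

/* The comparison as posted: rejects a packet that would fit exactly */
static bool pool_full_posted(const struct pool_sketch *p, size_t cnt)
{
	return p->count + slots_needed(cnt) >= p->size;
}

/* Same test with '>' instead of '>=': an exact fit is accepted */
static bool pool_full_exact(const struct pool_sketch *p, size_t cnt)
{
	return p->count + slots_needed(cnt) > p->size;
}
```

With size 8 and count 5, a two-vector packet needs 3 slots and fits exactly, yet the posted comparison reports the pool as full. The same applies to a single-buffer packet at count 7, which the old check (p->count >= p->size) would still have accepted.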
}
 /**
@@ -108,11 +109,9 @@ bool pool_full(const struct pool *p)
 void packet_add_do(struct pool *p, struct iov_tail *data,
 		   const char *func, int line)
 {
-	size_t idx = p->count;
-	const char *start;
-	size_t len;
+	size_t idx = p->count, i, offset;

-	if (pool_full(p)) {
+	if (pool_full(p, data)) {
 		debug("add packet index %zu to pool with size %zu, %s:%i",
 		      idx, p->size, func, line);
 		return;
@@ -121,18 +120,30 @@ void packet_add_do(struct pool *p, struct iov_tail *data,
 	if (!iov_tail_prune(data))
 		return;

-	ASSERT(data->cnt == 1); /* we don't support iovec */
+	if (data->cnt > 1) {
+		p->pkt[idx].iov_base = NULL;
+		p->pkt[idx].iov_len = data->cnt;
+		idx++;
+	}

-	len = data->iov[0].iov_len - data->off;
-	start = (char *)data->iov[0].iov_base + data->off;
+	offset = data->off;
+	for (i = 0; i < data->cnt; i++) {
+		const char *start;
+		size_t len;

-	if (packet_check_range(p, start, len, func, line))
-		return;
+		len = data->iov[i].iov_len - offset;
+		start = (char *)data->iov[i].iov_base + offset;
+		offset = 0;

-	p->pkt[idx].iov_base = (void *)start;
-	p->pkt[idx].iov_len = len;
+		if (packet_check_range(p, start, len, func, line))
+			return;

-	p->count++;
+		p->pkt[idx].iov_base = (void *)start;
+		p->pkt[idx].iov_len = len;
+		idx++;
Hm. Isn't the above equivalent to iov_tail_clone()? Is calling packet_check_range() on each chunk the only reason for open-coding it here?
+	}
+
+	p->count = idx;
 }
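For reference, the slot layout described in the commit message (a NULL-based header slot holding the vector count, followed by the data vectors) can be sketched in isolation as below. This is only an illustration under simplifying assumptions: slots_store() and slots_load() are hypothetical helpers, not functions from the patch, and they skip the offset handling and the packet_check_range() validation that the real code does.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical helper: store @cnt vectors starting at slot @idx of @pkt,
 * which has @size slots in total. A packet with more than one vector gets
 * a header slot first (iov_base == NULL, iov_len == vector count).
 * Returns the next free slot, or @idx unchanged if there is no room.
 */
static size_t slots_store(struct iovec *pkt, size_t size, size_t idx,
			  const struct iovec *iov, size_t cnt)
{
	size_t need = cnt + (cnt > 1), i;

	if (!cnt || idx > size || need > size - idx)
		return idx;

	if (cnt > 1) {
		pkt[idx].iov_base = NULL;
		pkt[idx].iov_len = cnt;
		idx++;
	}

	for (i = 0; i < cnt; i++)
		pkt[idx++] = iov[i];

	return idx;
}

/* Hypothetical helper: find the vectors of the packet starting at slot @idx.
 * Sets @cnt to the number of vectors and returns the first data vector.
 */
static const struct iovec *slots_load(const struct iovec *pkt, size_t idx,
				      size_t *cnt)
{
	if (pkt[idx].iov_base) {	/* single-buffer packet, no header */
		*cnt = 1;
		return &pkt[idx];
	}

	*cnt = pkt[idx].iov_len;	/* header slot: count of vectors */
	return &pkt[idx + 1];
}
```

Note that in this sketch the index passed to slots_load() is a slot index, not a packet index, so a caller walking the pool has to step over the header and data slots of each multi-vector packet itself.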
 /**
@@ -162,9 +173,14 @@ bool packet_get_do(const struct pool *p, size_t idx,
 		return false;
 	}

-	data->cnt = 1;
+	if (p->pkt[idx].iov_base) {
+		data->cnt = 1;
+		data->iov = &p->pkt[idx];
+	} else {
+		data->cnt = p->pkt[idx].iov_len;
+		data->iov = &p->pkt[idx + 1];
+	}
 	data->off = 0;
-	data->iov = &p->pkt[idx];

 	for (i = 0; i < data->cnt; i++) {
 		ASSERT_WITH_MSG(!packet_check_range(p, data->iov[i].iov_base,
diff --git a/packet.h b/packet.h
index e51cbd19fdc4..67dc7deb17db 100644
--- a/packet.h
+++ b/packet.h
@@ -37,7 +37,7 @@ void packet_add_do(struct pool *p, struct iov_tail *data,
 		   const char *func, int line);
 bool packet_get_do(const struct pool *p, const size_t idx,
 		   struct iov_tail *data, const char *func, int line);
-bool pool_full(const struct pool *p);
+bool pool_full(const struct pool *p, const struct iov_tail *data);
 void pool_flush(struct pool *p);

 #define packet_add(p, data)					\
diff --git a/tap.c b/tap.c
index 9fd00915bb01..95688b22fcb7 100644
--- a/tap.c
+++ b/tap.c
@@ -1103,14 +1103,14 @@ void tap_add_packet(struct ctx *c, struct iov_tail *data,
 	switch (ntohs(eh->h_proto)) {
 	case ETH_P_ARP:
 	case ETH_P_IP:
-		if (pool_full(pool_tap4)) {
+		if (pool_full(pool_tap4, data)) {
 			tap4_handler(c, pool_tap4, now);
 			pool_flush(pool_tap4);
 		}
 		packet_add(pool_tap4, data);
 		break;
 	case ETH_P_IPV6:
-		if (pool_full(pool_tap6)) {
+		if (pool_full(pool_tap6, data)) {
 			tap6_handler(c, pool_tap6, now);
 			pool_flush(pool_tap6);
 		}
-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson