On 24/07/2025 15:01, Laurent Vivier wrote:
On 18/07/2025 20:45, Stefano Brivio wrote:
On Mon, 23 Jun 2025 13:06:04 +0200 Laurent Vivier
wrote: This series introduces iov_tail to convey frame information between functions.
This is only an API change, for the moment the memory pool is only able to store contiguous buffer, so, except for vhost-user in a special case, we only play with iovec array with only one entry.
v7: - Add a patch to fix comment style of 'Return:' - Fix ignore_arp()/accept_arp() - Fix coverity error - Fix several comments
I was about to apply this without 1/31 (I applied the v2 of it you sent outside of this series instead, which is actually up to date) and with the minor comment fix to 31/31... but the test perf/passt_vu_tcp fails rather consistently now (and I triple checked without this series):
- "TCP throughput over IPv6: guest to host" with MTU 1500 and 9000 bytes now reports between 0 and 0.6 Gbps. The guest kernel prints a series of two messages with ~1-10 µs interval:
[ 21.159827] TCP: out of memory -- consider tuning tcp_mem [ 21.159831] TCP: out of memory -- consider tuning tcp_mem
- "TCP throughput over IPv4: guest to host" never reports 0 Gbps, but the throughput figure for large MTU (65520 bytes) is very low (5.4 Gbps in the last run). Here I'm getting four messages:
[ 40.807818] TCP: out of memory -- consider tuning tcp_mem [ 40.807829] TCP: out of memory -- consider tuning tcp_mem [ 40.807829] TCP: out of memory -- consider tuning tcp_mem [ 40.807830] TCP: out of memory -- consider tuning tcp_mem
- on the reverse direction, "TCP throughput over IPv4: host to guest" (but not with IPv6), the iperf3 client gets SIGSEGV, but not consistently, it happened once out of five times.
To me it smells a bit like we're leaking virtqueue slots but I looked again at the whole series and I couldn't find anything obvious... at least not yet.
UDP tests never fail and the throughput is the same as before.
I think the problem is the way we use the iovec array.
In tap4_handler() we have a packet_get() that provides a pointer to the iovec array from pool. Idx is 0, iovec idx is 0.
Then we have a pool_flush(), so first available idx is now 0 again.
And then we have packet_add() with the iovec idx (in "data") of the previous packet_get() that we try to add at the same index (as pool is empty again, and first available idx is 0).
When I wrote this patch I guessed that when we release packet (pool_flush()) we don't use anymore the iovec array of the packet, it appears to be not true.
Could you try the following patch (my iperf3 continue to crash <defunct> on my host system, not related to this problem), it's a little bit ugly (use of alloca()) but it's an easy fix. diff --git a/packet.c b/packet.c index 8dbe00af12c6..6d187192dd3a 100644 --- a/packet.c +++ b/packet.c @@ -155,18 +155,23 @@ static size_t packet_iov_next_idx(const struct pool *p, size_t idx, * @func: For tracing: name of calling function * @line: For tracing: caller line of function call */ -static void packet_iov_data(const struct pool *p, size_t idx, - struct iov_tail *data, - const char *func, int line) +static size_t packet_iov_data(const struct pool *p, size_t idx, + struct iovec *iov, size_t cnt, + const char *func, int line) { - struct iovec *iov = (struct iovec *)p->buf; - size_t iov_idx, iov_cnt; + const struct iovec *src_iov = (struct iovec *)p->buf; + size_t src_idx, src_cnt; + size_t i; - iov_idx = packet_iov_idx(p, idx, &iov_cnt, func, line); + src_idx = packet_iov_idx(p, idx, &src_cnt, func, line); - data->iov = &iov[iov_idx]; - data->cnt = iov_cnt; - data->off = 0; + if (cnt < src_cnt) + return 0; + + for (i = 0; i < src_cnt; i++) + iov[i] = src_iov[src_idx + i]; + + return i; } /** @@ -268,6 +273,7 @@ void packet_add_do(struct pool *p, struct iov_tail *data, */ bool packet_get_do(const struct pool *p, size_t idx, struct iov_tail *data, + struct iovec *iov, size_t cnt, const char *func, int line) { ASSERT_WITH_MSG(p->count <= p->size, @@ -281,12 +287,13 @@ bool packet_get_do(const struct pool *p, size_t idx, } if (p->memory) { - packet_iov_data(p, idx, data, func, line); + data->cnt = packet_iov_data(p, idx, iov, cnt, func, line); + data->iov = iov; } else { data->cnt = 1; - data->off = 0; data->iov = &p->pkt[idx]; } + data->off = 0; ASSERT_WITH_MSG(!packet_iov_check_range(p, data, func, line), "Corrupt packet pool, %s:%i", func, line); diff --git a/packet.h b/packet.h index 45843a6775ab..f1e28b3f31d3 100644 --- a/packet.h +++ b/packet.h @@ -35,14 +35,16 @@ int vu_packet_check_range(struct vdev_memory *memory, void packet_add_do(struct pool *p, struct iov_tail *data, const char *func, int line); bool packet_get_do(const struct pool *p, const size_t idx, - struct iov_tail *data, const char *func, int line); + struct iov_tail *data, + struct iovec *iov, size_t cnt, + const char *func, int line); bool pool_full(const struct pool *p); void pool_flush(struct pool *p); #define packet_add(p, data) \ packet_add_do(p, data, __func__, __LINE__) #define packet_get(p, idx, data) \ - packet_get_do(p, idx, data, __func__, __LINE__) + packet_get_do(p, idx, data, (struct iovec *)alloca(sizeof(struct iovec) * 10), 10, __func__, __LINE__) #define PACKET_POOL_DECL(_name, _size) \ struct _name ## _t { \