On Fri, 25 Jul 2025 13:45:36 +0200, Laurent Vivier wrote:
On 24/07/2025 15:01, Laurent Vivier wrote:
On 18/07/2025 20:45, Stefano Brivio wrote:
On Mon, 23 Jun 2025 13:06:04 +0200, Laurent Vivier wrote:

This series introduces iov_tail to convey frame information between functions.
This is only an API change: for the moment, the memory pool can only store contiguous buffers, so, except for one special case in vhost-user, we only ever deal with iovec arrays containing a single entry.
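The idea is that functions receive a view into an iovec array (base pointer, entry count, byte offset) rather than a bare buffer pointer. A minimal sketch of such a structure and of the single-buffer case; the field and helper names here are illustrative assumptions, not necessarily the series' actual definitions:

#include <stddef.h>
#include <sys/uio.h>

/* A "tail" view of frame data: the bytes starting @off bytes into the
 * iovec array @iov of @cnt entries.
 */
struct iov_tail {
	const struct iovec *iov;	/* array of buffers */
	size_t cnt;			/* number of entries in iov */
	size_t off;			/* offset of the first byte of interest */
};

/* The common case described above: one contiguous buffer, hence an
 * iovec array with a single entry (helper name is hypothetical).
 */
static struct iov_tail iov_tail_from_buf(void *buf, size_t len,
					 struct iovec *storage)
{
	*storage = (struct iovec){ .iov_base = buf, .iov_len = len };
	return (struct iov_tail){ .iov = storage, .cnt = 1, .off = 0 };
}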
v7:
  - Add a patch to fix comment style of 'Return:'
  - Fix ignore_arp()/accept_arp()
  - Fix Coverity error
  - Fix several comments
I was about to apply this without 1/31 (I applied the v2 of it that you sent outside of this series instead, which is actually up to date) and with the minor comment fix to 31/31... but the test perf/passt_vu_tcp now fails rather consistently (and I triple-checked that it doesn't without this series):
- "TCP throughput over IPv6: guest to host" with MTU 1500 and 9000 bytes now reports between 0 and 0.6 Gbps. The guest kernel prints a series of two messages with ~1-10 µs interval:
[   21.159827] TCP: out of memory -- consider tuning tcp_mem
[   21.159831] TCP: out of memory -- consider tuning tcp_mem
- "TCP throughput over IPv4: guest to host" never reports 0 Gbps, but the throughput figure for large MTU (65520 bytes) is very low (5.4 Gbps in the last run). Here I'm getting four messages:
[   40.807818] TCP: out of memory -- consider tuning tcp_mem
[   40.807829] TCP: out of memory -- consider tuning tcp_mem
[   40.807829] TCP: out of memory -- consider tuning tcp_mem
[   40.807830] TCP: out of memory -- consider tuning tcp_mem
- in the reverse direction, "TCP throughput over IPv4: host to guest" (but not with IPv6), the iperf3 client gets a SIGSEGV, though not consistently: it happened once out of five runs.
To me it smells a bit like we're leaking virtqueue slots, but I looked again at the whole series and couldn't find anything obvious... at least not yet.
UDP tests never fail and the throughput is the same as before.
I think the problem is the way we use the iovec array.
In tap4_handler() we have a packet_get() that returns a pointer to the iovec array stored in the pool. The packet index is 0, so the iovec index is 0.

Then we have a pool_flush(), so the first available index is 0 again.

And then we have a packet_add() with the iovec index (in "data") of the previous packet_get(), which we try to add at that same index (as the pool is empty again, the first available index is 0).
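To make the aliasing concrete, here is a self-contained toy model of that sequence. It is not the real passt API (names and signatures are simplified stand-ins), just the described pattern in miniature: a pointer handed out by the get still being used after the flush, when the same slot gets reused by the next add.

#include <stdio.h>
#include <sys/uio.h>

#define TOY_POOL_SIZE 8

struct toy_pool {
	struct iovec iov[TOY_POOL_SIZE];	/* per-slot descriptors */
	size_t count;				/* first available index */
};

/* "packet_add": store a descriptor at the first available index */
static void toy_add(struct toy_pool *p, struct iovec iov)
{
	p->iov[p->count++] = iov;
}

/* "packet_get": hand out a pointer into the pool's own storage */
static struct iovec *toy_get(struct toy_pool *p, size_t idx)
{
	return &p->iov[idx];
}

/* "pool_flush": mark every slot as free again */
static void toy_flush(struct toy_pool *p)
{
	p->count = 0;
}

int main(void)
{
	struct toy_pool pool = { .count = 0 };
	char old_frame[] = "old frame";
	char new_frame[] = "new frame";

	toy_add(&pool, (struct iovec){ old_frame, sizeof(old_frame) });

	struct iovec *data = toy_get(&pool, 0);	/* points at slot 0 */
	toy_flush(&pool);			/* slot 0 is "free" again */

	/* the next frame is added at index 0 too, behind data's back */
	toy_add(&pool, (struct iovec){ new_frame, sizeof(new_frame) });

	/* data was supposed to describe "old frame"... */
	printf("data now describes: %s\n", (char *)data->iov_base);
	return 0;
}

This prints "new frame": the pointer obtained before the flush silently ends up describing the packet added after it.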
When I wrote this patch, I assumed that once we release the packets (pool_flush()) we no longer use their iovec arrays; that turns out not to be true.
Could you try the following patch? (My iperf3 keeps crashing, leaving a <defunct> process on my host system, but that's not related to this problem.) It's a little ugly (it uses alloca()), but it's an easy fix.
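The patch itself is not quoted in this excerpt. As a rough sketch of the approach described, copying the iovec into the caller's stack frame with alloca() before the slot can be reused would, in the toy model above, look something like this (an illustration, not the actual patch):

	/* In the toy model's main(), keep a private copy instead of a
	 * pointer into the pool (needs <alloca.h> and <string.h>):
	 */
	struct iovec *copy = alloca(sizeof(*copy));

	memcpy(copy, toy_get(&pool, 0), sizeof(*copy));	/* copy slot 0 */
	toy_flush(&pool);				/* slot 0 free again */
	toy_add(&pool, *copy);				/* no aliasing: copy
							 * lives on our stack */

With a real multi-entry iovec array the copy would be cnt * sizeof(struct iovec), which might explain the concern below about a large alloca().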
Weird, I don't get the "TCP: out of memory" messages anymore, but throughput still has some obvious issues:

=== perf/passt_vu_tcp
passt: throughput and latency
Throughput in Gbps, latency in µs, 6 threads at 3.6 GHz

                                     MTU: |  256B  |  576B  | 1280B  | 1500B  | 9000B  | 65520B |
                                          |--------|--------|--------|--------|--------|--------|
  TCP throughput over IPv6: guest to host |   -    |   -    |  0.6   |  2.0   |   0    |  13.3  |
  TCP RR latency over IPv6: guest to host |   -    |   -    |   -    |   -    |   -    |   38   |
 TCP CRR latency over IPv6: guest to host |   -    |   -    |   -    |   -    |   -    |   95   |
                                          |--------|--------|--------|--------|--------|--------|
  TCP throughput over IPv4: guest to host |   0    |  2.2   |  1.0   |  1.7   |  9.9   |  16.7  |
  TCP RR latency over IPv4: guest to host |   -    |   -    |   -    |   -    |   -    |   35   |
 TCP CRR latency over IPv4: guest to host |   -    |   -    |   -    |   -    |   -    |   91   |
                                          |--------|--------|--------|--------|--------|--------|
...we should never have < 0.1 Gbps, and with the 64k MTU we otherwise have 30-40 Gbps.

Even more weird, iperf3 now reliably crashes for me too, on the other path (host to guest). It never did without this series.

I wonder if the issue is actually "fixed" with this patch, but the alloca() is so large as to actually mess with throughput. I can play with that if it helps, or bisect... let me know.

I have to admit I didn't fully grasp the problem at hand, simply because I haven't read carefully what you wrote yet.

--
Stefano