On 4/3/26 18:59, Stefano Brivio wrote:
On Fri, 3 Apr 2026 17:18:23 +0200 Laurent Vivier
wrote: On 4/3/26 13:53, Stefano Brivio wrote:
On Wed, 1 Apr 2026 21:23:24 +0200 Laurent Vivier
wrote: The previous code assumed a 1:1 mapping between virtqueue elements and iovec entries (enforced by an assert). Drop that assumption to allow elements that span multiple iovecs: track elem_used separately by walking the element list against the iov count returned after padding. This also fixes vu_queue_rewind() and vu_flush() to use the element count rather than the iov count.
Use iov_tail_clone() in udp_vu_sock_recv() to handle header offset, replacing the manual base/len adjustment and restore pattern.
Signed-off-by: Laurent Vivier
--- udp_vu.c | 29 ++++++++++++++--------------- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/udp_vu.c b/udp_vu.c index 30af64034516..5608a3a96ff5 100644 --- a/udp_vu.c +++ b/udp_vu.c @@ -64,30 +64,25 @@ static size_t udp_vu_hdrlen(bool v6) */ static ssize_t udp_vu_sock_recv(struct iovec *iov, size_t *cnt, int s, bool v6) { + struct iovec msg_iov[*cnt];
Variable-length Arrays (VLAs) are allowed starting from C99 but we should really really avoid them. If 'cnt' is big enough, we risk writing all over the place. That's the main reason why they were more or less banned from the Linux kernel some years ago and eventually eradicated:
I can use alloca() if you prefer ;)
Claude, is this you? ;)
I guess if you come up with a sufficient convoluted maze of "elem" / "iov" / "head" macros using concatenation to strategically place calls to strndupa(), with an abstraction based on IOV "tails" called "strain" indicated by "strn" for brevity... one day I might miss that, yes.
But I'll try to remember that, the next time we discuss whether it's really needed to duplicate strain A or if a single copy of it is enough.
https://lore.kernel.org/lkml/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1R...
https://lore.kernel.org/lkml/20181028172401.GA41102@beast/
Can we use VIRTQUEUE_MAX_SIZE as upper bound like udp_vu_sock_to_tap() does?
Yes, but the idea here is we have always *cnt < VIRTQUEUE_MAX_SIZE (because the value comes from vu_collect() and in vu_collect(): *in_sg (&iov_cnt or *cnt) < max_in_sg (ARRAY_SIZE(iov_vu) or VIRTQUEUE_MAX_SIZE or 1024)
...until somebody, running this somewhere where we don't have gcc's stack protector stuff (or equivalent), without having quite obtained full arbitrary code execution yet, finds a way to manipulate *cnt...
And vu_collect(), in this case, sets generally *in_sg to a value lower than 44: we want to create a frame of ETH_MAX_MTU by coalescing kernel buffers of size ETH_FRAME_LEN)
If it _can_ be 16 KiB, then I would suggest it's better to _just_ have 16 KiB. It's more auditable, and it's not like we "allocate" it anyway.
On top of it, udp_vu_sock_to_tap() already does that, and other functions (with potentially deeper call trees) do worse. I'm not claiming it's a good idea to do it "because it's bad anyway", in general, but in this case the maximum is what matters.
For me 16 kB on the stack is a lot of memory (but I started programming on a 48 kB RAM computer...).
I guess I started with around ten times that but I tend to agree. What I'm suggesting is that if it can be a lot, better just make it that lot.
An alternative I'm pondering about is whether we can make things recoverably / gracefully fail if that's > 64 or something like that. At this point we still have that data on the socket and we could dequeue it in a later pass I suppose. But maybe it gets very complicated...
I understand. I already sent a v6 without this change but for the v7 I will remove the VLA (without using alloca()). Thanks, Laurent