On Wed, 1 Apr 2026 21:23:24 +0200
Laurent Vivier
The previous code assumed a 1:1 mapping between virtqueue elements and iovec entries (enforced by an assert). Drop that assumption to allow elements that span multiple iovecs: track elem_used separately by walking the element list against the iov count returned after padding. This also fixes vu_queue_rewind() and vu_flush() to use the element count rather than the iov count.
Use iov_tail_clone() in udp_vu_sock_recv() to handle header offset, replacing the manual base/len adjustment and restore pattern.
Signed-off-by: Laurent Vivier
--- udp_vu.c | 29 ++++++++++++++--------------- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/udp_vu.c b/udp_vu.c index 30af64034516..5608a3a96ff5 100644 --- a/udp_vu.c +++ b/udp_vu.c @@ -64,30 +64,25 @@ static size_t udp_vu_hdrlen(bool v6) */ static ssize_t udp_vu_sock_recv(struct iovec *iov, size_t *cnt, int s, bool v6) { + struct iovec msg_iov[*cnt];
Variable-length Arrays (VLAs) are allowed starting from C99 but we should really really avoid them. If 'cnt' is big enough, we risk writing all over the place. That's the main reason why they were more or less banned from the Linux kernel some years ago and eventually eradicated: https://lore.kernel.org/lkml/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1R... https://lore.kernel.org/lkml/20181028172401.GA41102@beast/ Can we use VIRTQUEUE_MAX_SIZE as upper bound like udp_vu_sock_to_tap() does? We should probably add -Wvla by the way.
struct msghdr msg = { 0 }; + struct iov_tail payload; size_t hdrlen, iov_used; ssize_t dlen;
/* compute L2 header length */ hdrlen = udp_vu_hdrlen(v6);
- /* reserve space for the headers */ - assert(iov[0].iov_len >= MAX(hdrlen, ETH_ZLEN + VNET_HLEN)); - iov[0].iov_base = (char *)iov[0].iov_base + hdrlen; - iov[0].iov_len -= hdrlen; + payload = IOV_TAIL(iov, *cnt, hdrlen);
- /* read data from the socket */ - msg.msg_iov = iov; - msg.msg_iovlen = *cnt; + msg.msg_iov = msg_iov; + msg.msg_iovlen = iov_tail_clone(msg.msg_iov, payload.cnt, &payload);
+ /* read data from the socket */ dlen = recvmsg(s, &msg, 0); if (dlen < 0) return -1;
- /* restore the pointer to the headers address */ - iov[0].iov_base = (char *)iov[0].iov_base - hdrlen; - iov[0].iov_len += hdrlen; - iov_used = iov_skip_bytes(iov, *cnt, MAX(dlen + hdrlen, VNET_HLEN + ETH_ZLEN), NULL); @@ -205,7 +200,7 @@ void udp_vu_sock_to_tap(const struct ctx *c, int s, int n, flow_sidx_t tosidx) }
for (i = 0; i < n; i++) { - unsigned elem_cnt, elem_used; + unsigned elem_cnt, elem_used, j, k; size_t iov_cnt; ssize_t dlen;
@@ -215,15 +210,19 @@ void udp_vu_sock_to_tap(const struct ctx *c, int s, int n, flow_sidx_t tosidx) if (elem_cnt == 0) break;
- assert((size_t)elem_cnt == iov_cnt); /* one iovec per element */ - dlen = udp_vu_sock_recv(iov_vu, &iov_cnt, s, v6); if (dlen < 0) { vu_queue_rewind(vq, elem_cnt); break; }
- elem_used = iov_cnt; /* one iovec per element */ + elem_used = 0; + for (j = 0, k = 0; k < iov_cnt && j < elem_cnt; j++) { + if (k + elem[j].in_num > iov_cnt) + elem[j].in_num = iov_cnt - k;
I think it would be more intuitive to write it like this: size_t iov_still_needed = iov_cnt - k; if (elem[j].in_num > iov_still_needed) elem[j].in_num = iov_still_needed; ...otherwise it looks like 'k + elem[j].in_num' needs to satisfy some kind of bound, but that's not the case.
+ k += elem[j].in_num; + elem_used++; + }
/* release unused buffers */ vu_queue_rewind(vq, elem_cnt - elem_used);
-- Stefano