On 4/3/26 08:20, Stefano Brivio wrote:
On Wed, 1 Apr 2026 21:18:26 +0200 Laurent Vivier wrote:

The previous per-protocol padding done by vu_pad() in tcp_vu.c and udp_vu.c was only correct for single-buffer frames: it assumed the padding area always fell within the first iov, and wrote past its end with a plain memset() whenever that assumption did not hold.
It also required each caller to compute MAX(..., ETH_ZLEN + VNET_HLEN) for vu_collect() and to call vu_pad() at the right point, duplicating the minimum-size logic across protocols.
Move the Ethernet minimum size enforcement into vu_collect() itself, so that enough buffer space is always reserved for padding regardless of the requested frame size.
Rewrite vu_pad() to take a full iovec array and use iov_memset(), making it safe for multi-buffer (mergeable rx buffer) frames.
In tcp_vu_sock_recv(), replace iov_truncate() with iov_skip_bytes(): now that all consumers receive explicit data lengths, truncating the iovecs is no longer needed. In tcp_vu_data_from_sock(), cap each frame's data length against the remaining bytes actually received from the socket, so that the last partial frame gets correct headers and sequence number advancement.
Signed-off-by: Laurent Vivier
---
 iov.c       |  1 -
 tcp_vu.c    | 29 ++++++++++++++---------------
 udp_vu.c    | 14 ++++++++------
 vu_common.c | 32 +++++++++++++++----------------
 vu_common.h |  2 +-
 5 files changed, 38 insertions(+), 40 deletions(-)

diff --git a/iov.c b/iov.c
index 83b683f3976a..2289b425529e 100644
--- a/iov.c
+++ b/iov.c
@@ -180,7 +180,6 @@ size_t iov_truncate(struct iovec *iov, size_t iov_cnt, size_t size)
  * Will write less than @length bytes if it runs out of space in
  * the iov
  */
-/* cppcheck-suppress unusedFunction */
 void iov_memset(const struct iovec *iov, size_t iov_cnt, size_t offset,
 		int c, size_t length)
 {
diff --git a/tcp_vu.c b/tcp_vu.c
index ae79a6d856b0..cae6926334b9 100644
--- a/tcp_vu.c
+++ b/tcp_vu.c
@@ -72,12 +72,12 @@ int tcp_vu_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, int flags)
 	struct vu_dev *vdev = c->vdev;
 	struct vu_virtq *vq = &vdev->vq[VHOST_USER_RX_QUEUE];
 	struct vu_virtq_element flags_elem[2];
-	size_t optlen, hdrlen, l2len;
 	struct ipv6hdr *ip6h = NULL;
 	struct iphdr *ip4h = NULL;
 	struct iovec flags_iov[2];
 	struct tcp_syn_opts *opts;
 	struct iov_tail payload;
+	size_t optlen, hdrlen;
 	struct tcphdr *th;
 	struct ethhdr *eh;
 	uint32_t seq;
@@ -88,7 +88,7 @@ int tcp_vu_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, int flags)
 	elem_cnt = vu_collect(vdev, vq, &flags_elem[0], 1, &flags_iov[0],
 			      1, NULL,
-			      MAX(hdrlen + sizeof(*opts), ETH_ZLEN + VNET_HLEN),
-			      NULL);
+			      hdrlen + sizeof(*opts), NULL);
 	if (elem_cnt != 1)
 		return -1;
@@ -128,7 +128,6 @@ int tcp_vu_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, int flags)
 		return ret;
 	}
-	iov_truncate(&flags_iov[0], 1, hdrlen + optlen);
 	payload = IOV_TAIL(flags_elem[0].in_sg, 1, hdrlen);
 	if (flags & KEEPALIVE)
@@ -137,9 +136,7 @@ int tcp_vu_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, int flags)
 	tcp_fill_headers(c, conn, eh, ip4h, ip6h, th, &payload, optlen,
 			 NULL, seq, !*c->pcap);
-	l2len = optlen + hdrlen - VNET_HLEN;
-	vu_pad(&flags_elem[0].in_sg[0], l2len);
-
+	vu_pad(flags_elem[0].in_sg, 1, hdrlen + optlen);
 	vu_flush(vdev, vq, flags_elem, 1, hdrlen + optlen);
 	if (*c->pcap)
@@ -149,7 +146,7 @@ int tcp_vu_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, int flags)
 	if (flags & DUP_ACK) {
 		elem_cnt = vu_collect(vdev, vq, &flags_elem[1], 1,
 				      &flags_iov[1], 1, NULL,
-				      flags_elem[0].in_sg[0].iov_len, NULL);
+				      hdrlen + optlen, NULL);
 		if (elem_cnt == 1 &&
 		    flags_elem[1].in_sg[0].iov_len >=
 		    flags_elem[0].in_sg[0].iov_len) {
@@ -213,7 +210,7 @@ static ssize_t tcp_vu_sock_recv(const struct ctx *c, struct vu_virtq *vq,
 			 ARRAY_SIZE(elem) - elem_cnt,
 			 &iov_vu[DISCARD_IOV_NUM + iov_used],
 			 VIRTQUEUE_MAX_SIZE - iov_used, &in_total,
-			 MAX(MIN(mss, fillsize) + hdrlen, ETH_ZLEN + VNET_HLEN),
+			 MIN(mss, fillsize) + hdrlen,
 			 &frame_size);
 		if (cnt == 0)
 			break;
@@ -249,8 +246,11 @@ static ssize_t tcp_vu_sock_recv(const struct ctx *c, struct vu_virtq *vq,
 	if (!peek_offset_cap)
 		ret -= already_sent;
-	/* adjust iov number and length of the last iov */
-	i = iov_truncate(&iov_vu[DISCARD_IOV_NUM], iov_used, ret);
+	i = iov_skip_bytes(&iov_vu[DISCARD_IOV_NUM], iov_used,
+		MAX(hdrlen + ret, VNET_HLEN + ETH_ZLEN),
+		NULL);
Nit: this should be aligned like this:

	i = iov_skip_bytes(&iov_vu[DISCARD_IOV_NUM], iov_used,
			   MAX(hdrlen + ret, VNET_HLEN + ETH_ZLEN), NULL);
+	if ((size_t)i < iov_used)
+		i++;
I'm a bit lost here. I see that this increment restores the iov_truncate() convention of returning the number of iov items (which we need later), but... what happens if we have i >= iov_used (even though my assumption is that it should never happen)? Are we throwing away data?

iov_truncate() was truncating the iovec array (reducing the count and the iov_len of the last iovec) to fit the actual size of the data. Here, instead, we are counting the number of elements: we have collected more elements than needed to store the data, so we need to know how many we actually use, in order to give the unused ones back to the virtqueue. Again, the confusing point is that we have the same number of elements as the number of iovecs; that is fixed in the following series.

i cannot be greater than iov_used. If i == iov_used, it means we need all the elements.
Thanks,
Laurent