On Thu, Jul 24, 2025 at 2:25 AM David Gibson
On Wed, Jul 09, 2025 at 07:47:40PM +0200, Eugenio Pérez wrote:
vhost kernel expects this as the first data of the frame.
Signed-off-by: Eugenio Pérez
--- tap.c | 2 +- tcp_buf.c | 15 +++++++++++++-- 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/tap.c b/tap.c index 0c49e6d..0656294 100644 --- a/tap.c +++ b/tap.c @@ -593,7 +593,7 @@ size_t tap_send_frames(const struct ctx *c, const struct iovec *iov, nframes - m, nframes);
pcap_multiple(iov, bufs_per_frame, m, - c->mode == MODE_PASST ? sizeof(uint32_t) : 0); + c->mode == MODE_PASST ? sizeof(uint32_t) : sizeof(struct virtio_net_hdr_mrg_rxbuf));
If I understand correctly, you're always considering virtio_net_hdr_mrg_rxbuf part of the extended frame in PASTA mode, it's just unused when you're not using the vhost path. Is that correct?
We probably want a helper that returns the correct tap header length, a companion to tap_hdr_iov().
Yes, the plan was to add all of these conditionals in a later version but I failed to document it properly :).
return m; } diff --git a/tcp_buf.c b/tcp_buf.c index d1fca67..2fbd056 100644 --- a/tcp_buf.c +++ b/tcp_buf.c @@ -22,6 +22,8 @@
#include
+#include
+ #include "util.h" #include "ip.h" #include "iov.h" @@ -43,7 +45,7 @@ static struct ethhdr tcp4_eth_src; static struct ethhdr tcp6_eth_src; -static struct tap_hdr tcp_payload_tap_hdr[TCP_FRAMES_MEM]; +static struct virtio_net_hdr_mrg_rxbuf tcp_payload_tap_hdr[TCP_FRAMES_MEM];
The intention is that struct tap_hdr can store whatever sort of "below L2" header we need - it's just that so far we only had the trivial qemu socket header or nothing. Now we're adding another option, so it should be a union branch of tap_hdr, rather than a separate structure.
Right.
/* IP headers for IPv4 and IPv6 */ struct iphdr tcp4_payload_ip[TCP_FRAMES_MEM]; @@ -75,6 +77,14 @@ void tcp_update_l2_buf(const unsigned char *eth_d, const unsigned char *eth_s) eth_update_mac(&tcp6_eth_src, eth_d, eth_s); }
+static inline struct iovec virtio_net_hdr_iov(struct virtio_net_hdr_mrg_rxbuf *hdr) +{ + return (struct iovec){ + .iov_base = hdr, + .iov_len = sizeof(*hdr), + }; +}
With that unification of tap_hdr_iov(), this should be a branch in tap_hdr_iov(), rather than a separate function. tap_hdr_update() will also need to be updated to only update vnet_len in PASST (qemu socket) mode.
It is now occurring to me that adding vhost is highlighting an existing ambiguity in our terminology: "tap" can mean either "the guest facing interface, of whatever mechanics", or specifically the kernel tuntap device. Maybe we need to look at a great renaming.
+ /** * tcp_sock_iov_init() - Initialise scatter-gather L2 buffers for IPv4 sockets * @c: Execution context @@ -85,6 +95,7 @@ void tcp_sock_iov_init(const struct ctx *c) struct iphdr iph = L2_BUF_IP4_INIT(IPPROTO_TCP); int i;
+ (void)c; tcp6_eth_src.h_proto = htons_constant(ETH_P_IPV6); tcp4_eth_src.h_proto = htons_constant(ETH_P_IP);
@@ -96,7 +107,7 @@ void tcp_sock_iov_init(const struct ctx *c) for (i = 0; i < TCP_FRAMES_MEM; i++) { struct iovec *iov = tcp_l2_iov[i];
- iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp_payload_tap_hdr[i]); + iov[TCP_IOV_TAP] = virtio_net_hdr_iov(&tcp_payload_tap_hdr[i]); iov[TCP_IOV_ETH].iov_len = sizeof(struct ethhdr); iov[TCP_IOV_PAYLOAD].iov_base = &tcp_payload[i]; }
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson