On Thu, Sep 19, 2024 at 03:51:43PM +0200, Stefano Brivio wrote:Sorry for the delay, I wanted first to finish extending tests to run also functional ones (not just throughput and latency) with vhost-user, but it's taking me a bit longer than expected, so here comes the review. By the way, by mistake I let passt run in non-vhost-user mode while QEMU was configured to use it. This results in a loop: qemu-system-x86_64: -netdev vhost-user,id=netdev0,chardev=chr0: Failed to read msg header. Read 0 instead of 12. Original request 1. qemu-system-x86_64: -netdev vhost-user,id=netdev0,chardev=chr0: vhost_backend_init failed: Protocol error qemu-system-x86_64: -netdev vhost-user,id=netdev0,chardev=chr0: failed to init vhost_net for queue 0 qemu-system-x86_64: -netdev vhost-user,id=netdev0,chardev=chr0: Failed to read msg header. Read 0 instead of 12. Original request 1. qemu-system-x86_64: -netdev vhost-user,id=netdev0,chardev=chr0: vhost_backend_init failed: Protocol error qemu-system-x86_64: -netdev vhost-user,id=netdev0,chardev=chr0: failed to init vhost_net for queue 0 ... and passt says: accepted connection from PID 4807 Bad frame size from guest, resetting connection Client connection closed accepted connection from PID 4807 Bad frame size from guest, resetting connection Client connection closed ... while happily flooding system logs. I guess it should be fixed in QEMU at some point: if the vhost_net initialisation fails, I don't see the point in retrying. This is without the "reconnect" option by the way:Fwiw, I tend to agree. [snip]If we introduced this as deprecated, then we should also introduce --no-vhost-user immediately (as a no-op). So that people can make future proof scripts that _don't_ want VU right away. Or... Maybe we should start deprecating in terms of a a more general option, say: --backend qemu-stream --backend tuntap --backend vhost-user So we don't need to proliferate even more options, when say we want to add: --backend vduse Or whatever.+.TP +.BR \-\-vhost-userI think we should introduce this option as deprecated right away, so that we can switch to vhost-user mode by default soon (checking if the hypervisor sends us a vhost-user command) without having to keep this option around. At that point, we can add --no-vhost-user instead.If it makes sense, you could copy the text from --stderr: Note that this configuration option is \fBdeprecated\fR and will be removed in a future version.Yeah, I don't think having different epoll types would be particularly helpful here. If we have multiple guests we won't necessarily know which one we'll be forwarding to until we've been through the "forward / NAT" logic.+Enable vhost-user. The vhost-user command socket is provided by \fB--socket\fR. + +.TP +.BR \-\-print-capabilities +Print back-end capabilities in JSON format, only meaningful for vhost-user mode. + .TP .BR \-F ", " \-\-fd " " \fIFD Pass a pre-opened, connected socket to \fBpasst\fR. Usually the socket is opened diff --git a/passt.c b/passt.c index ad6f0bc32df6..b64efeaf346c 100644 --- a/passt.c +++ b/passt.c @@ -74,6 +74,8 @@ char *epoll_type_str[] = { [EPOLL_TYPE_TAP_PASTA] = "/dev/net/tun device", [EPOLL_TYPE_TAP_PASST] = "connected qemu socket", [EPOLL_TYPE_TAP_LISTEN] = "listening qemu socket", + [EPOLL_TYPE_VHOST_CMD] = "vhost-user command socket", + [EPOLL_TYPE_VHOST_KICK] = "vhost-user kick socket", }; static_assert(ARRAY_SIZE(epoll_type_str) == EPOLL_NUM_TYPES, "epoll_type_str[] doesn't match enum epoll_type"); @@ -206,6 +208,7 @@ int main(int argc, char **argv) struct rlimit limit; struct timespec now; struct sigaction sa; + struct vu_dev vdev; clock_gettime(CLOCK_MONOTONIC, &log_start); @@ -262,6 +265,8 @@ int main(int argc, char **argv) pasta_netns_quit_init(&c); tap_sock_init(&c); + if (c.mode == MODE_VU) + vu_init(&c, &vdev); secret_init(&c); @@ -352,14 +357,31 @@ loop: tcp_timer_handler(&c, ref); break; case EPOLL_TYPE_UDP_LISTEN: - udp_listen_sock_handler(&c, ref, eventmask, &now); + if (c.mode == MODE_VU) {Eventually, we'll probably want to make passt more generic and to support multiple guests, so at that point this might become EPOLL_TYPE_UDP_VU_LISTEN if it's a socket we opened for a guest using vhost-user. Or maybe we'll have to unify the receive paths, so this will remain EPOLL_TYPE_UDP_LISTEN.Either way, _if it's more convenient for you right now_, I wouldn't see any issue in defining new EPOLL_TYPE_UDP_VU_{LISTEN,REPLY} values. > + udp_vu_listen_sock_handler(&c, ref, eventmask, > + &now); > + } else { > + udp_buf_listen_sock_handler(&c, ref, eventmask, > + &now); > + } > break; > case EPOLL_TYPE_UDP_REPLY:It could potentially be used for this case, though, since this is associated with a single flow.> - udp_reply_sock_handler(&c, ref, eventmask, &now); > + if (c.mode == MODE_VU) > + udp_vu_reply_sock_handler(&c, ref, eventmask, > + &now); > + else > + udp_buf_reply_sock_handler(&c, ref, eventmask, > + &now); > break;[snip]I'm guessing this was meant to go on the next function down. This might be a content conflict with one of my recent changes: the "TCP cleanups" added "const" to a lot of context pointers through the TCP code, I'm guessing this function was calling one of them, so couldn't be const when Laurent first wrote it.+/** + * tcp_vu_pcap() - Capture a single frame to pcap file (TCP) + * @c: Execution context + * @tapside: Address information for one side of the flow + * @iov: Pointer to the array of IO vectors + * @iov_used: Length of the array + * @l4len: IPv4 Payload length + */ +static void tcp_vu_pcap(const struct ctx *c, const struct flowside *tapside,'c' should be const (unless you modify data pointed by it, but I don't see where), otherwise gcc complains:tcp.c: In function ‘tcp_send_flag’: tcp.c:1249:41: warning: passing argument 1 of ‘tcp_vu_send_flag’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers] 1249 | return tcp_vu_send_flag(c, conn, flags); | ^ In file included from tcp.c:307: tcp_vu.h:9:34: note: expected ‘struct ctx *’ but argument is of type ‘const struct ctx *’ 9 | int tcp_vu_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags); | ~~~~~~~~~~~~^-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson