On 10/09/2024 17:47, Stefano Brivio wrote:
Nits and a couple of questions only:
On Fri, 6 Sep 2024 18:04:48 +0200 Laurent Vivier
wrote: Add vhost_user.c and vhost_user.h that define the functions needed to implement vhost-user backend.
Signed-off-by: Laurent Vivier
--- Makefile | 4 +- iov.c | 1 - vhost_user.c | 1265 ++++++++++++++++++++++++++++++++++++++++++++++++++ vhost_user.h | 203 ++++++++ virtio.c | 5 - virtio.h | 2 +- 6 files changed, 1471 insertions(+), 9 deletions(-) create mode 100644 vhost_user.c create mode 100644 vhost_user.h ... diff --git a/vhost_user.c b/vhost_user.c new file mode 100644 index 000000000000..6008a8adc967 --- /dev/null +++ b/vhost_user.c ... +/** + * vu_wait_queue() - wait for new free entries in the virtqueue + * @vq: virtqueue to wait on + */ +static int vu_wait_queue(const struct vu_virtq *vq) +{ + eventfd_t kick_data; + ssize_t rc; + int status; + + /* wait for the kernel to put new entries in the queue */ + status = fcntl(vq->kick_fd, F_GETFL); + if (status == -1) + return -1; Same as on v3 (I see you changed this below, but not here): if you don't use status later, you can omit storing it.
We need status with F_SETFL below:
+ + if (fcntl(vq->kick_fd, F_SETFL, status & ~O_NONBLOCK)) + return -1; + + rc = eventfd_read(vq->kick_fd, &kick_data); + + if (fcntl(vq->kick_fd, F_SETFL, status)) + return -1; + + if (rc == -1) + return -1; + + return 0; +}
...
+/** + * vu_handle_tx() - Receive data from the TX virtqueue + * @vdev: vhost-user device + * @index: index of the virtqueue + * @now: Current timestamp + */ +static void vu_handle_tx(struct vu_dev *vdev, int index, + const struct timespec *now) +{ + struct vu_virtq_element elem[VIRTQUEUE_MAX_SIZE]; + struct iovec out_sg[VIRTQUEUE_MAX_SIZE]; + struct vu_virtq *vq = &vdev->vq[index]; + int hdrlen = vdev->hdrlen; + int out_sg_count; + int count; +
Excess newline (same as v3).
Done.
+ + if (!VHOST_USER_IS_QUEUE_TX(index)) { + debug("vhost-user: index %d is not a TX queue", index); + return; + } + + tap_flush_pools(); + + count = 0; + out_sg_count = 0; + while (count < VIRTQUEUE_MAX_SIZE) {
So, I see that this is limited to 1024 iterations now (it was limited also earlier, but I didn't realise that).
If we loop at most VIRTQUEUE_MAX_SIZE times, that means, I guess, that while we're popping elements, the queue can't be written to, correct?
No, I think the queue can be read and write at the same time.
Or it can be written to, but we'll get an additional kick after vu_queue_notify() if that happens?
I could check the protocol and the code, but I think it should work like that.
+ int ret; + + elem[count].out_num = 1; + elem[count].out_sg = &out_sg[out_sg_count]; + elem[count].in_num = 0; + elem[count].in_sg = NULL; + ret = vu_queue_pop(vdev, vq, &elem[count]); + if (ret < 0) + break; + out_sg_count += elem[count].out_num; + + if (elem[count].out_num < 1) { + debug("virtio-net header not in first element"); + break; + } + ASSERT(elem[count].out_num == 1); + + tap_add_packet(vdev->context, + elem[count].out_sg[0].iov_len - hdrlen, + (char *)elem[count].out_sg[0].iov_base + hdrlen); + count++; + } + tap_handler(vdev->context, now); + + if (count) { + int i; + + for (i = 0; i < count; i++) + vu_queue_fill(vq, &elem[i], 0, i); + vu_queue_flush(vq, count); + vu_queue_notify(vdev, vq); + } +} +
...
+/** + * vu_set_vring_err_exec() - Set the event file descriptor to signal when + * error occurs + * @vdev: vhost-user device + * @vmsg: vhost-user message + * + * Return: False as no reply is requested + */ +static bool vu_set_vring_err_exec(struct vu_dev *vdev, + struct vhost_user_msg *msg) +{ + bool nofd = msg->payload.u64 & VHOST_USER_VRING_NOFD_MASK; + int idx = msg->payload.u64 & VHOST_USER_VRING_IDX_MASK; + + debug("u64: 0x%016"PRIx64, msg->payload.u64); + + vu_check_queue_msg_file(msg); + + if (vdev->vq[idx].err_fd != -1) { + close(vdev->vq[idx].err_fd); + vdev->vq[idx].err_fd = -1; + } + + /* cppcheck-suppress redundantAssignment */ + vdev->vq[idx].err_fd = nofd ? -1 : msg->fds[0];
Maybe you missed this comment to v3:
-- Wouldn't it be easier (and not require a suppression) to say:
if (!nofd) vdev->vq[idx].err_fd = msg->fds[0];
Yes, you're right. I thought I fixed that but I think I have overwritten my changes... (I also changed in this way call_fd and kick_fd). ...
+/** + * vu_set_protocol_features_exec() - Enable protocol (vhost-user) features + * @vdev: vhost-user device + * @vmsg: vhost-user message + * + * Return: False as no reply is requested + */ +static bool vu_set_protocol_features_exec(struct vu_dev *vdev, + struct vhost_user_msg *msg) +{ + uint64_t features = msg->payload.u64; + + debug("u64: 0x%016"PRIx64, features); + + vdev->protocol_features = msg->payload.u64; + + if (vu_has_protocol_feature(vdev, + VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS) && + (!vu_has_protocol_feature(vdev, VHOST_USER_PROTOCOL_F_BACKEND_REQ) || + !vu_has_protocol_feature(vdev, VHOST_USER_PROTOCOL_F_REPLY_ACK))) {
Same as v3:
-- Do we actually care about VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS at all, I wonder? This whole part (coming from ff1320050a3a "libvhost-user: implement in-band notifications") is rather hard to read/understand, so it would be great if we could just get rid of it altogether.
But if not, sure, let's leave it like the original, I'd say.
I remove it.
--
+ /* + * The use case for using messages for kick/call is simulation, to make + * the kick and call synchronous. To actually get that behaviour, both + * of the other features are required. + * Theoretically, one could use only kick messages, or do them without + * having F_REPLY_ACK, but too many (possibly pending) messages on the + * socket will eventually cause the master to hang, to avoid this in + * scenarios where not desired enforce that the settings are in a way + * that actually enables the simulation case. + */ + die("F_IN_BAND_NOTIFICATIONS requires F_BACKEND_REQ && F_REPLY_ACK"); + } + + return false; +}
Thanks, Laurent