[PATCH v24 1/5] vhost_user: Clear ring address on GET_VRING_BASE

14 Feb 2025

GET_VRING_BASE stops the queue, clearing the call and kick fds.  However,
we don't clear vring.avail, so vu_queue_notify() won't realise the queue
isn't ready.  If it is called at that point, it tries to use the now
invalid call fd and dies with EBADF.

We hit this during migration because QEMU, for some reason, reconfigures
the vhost-user device when a migration is triggered.  There's a window
between GET_VRING_BASE and the call fd being re-established in which the
notify function can be called, causing a crash.

Signed-off-by: David Gibson 
---
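Note (not for the commit message): a minimal, self-contained sketch of the
guard this patch relies on.  All names prefixed sketch_ are hypothetical
simplifications, not the real structures, and it assumes the notify path
treats a zero vring.avail as "queue not ready", which is what the
description above implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real queue state */
struct sketch_vring {
	void *avail;		/* NULL means the ring isn't mapped/ready */
};

struct sketch_virtq {
	struct sketch_vring vring;
	int call_fd;		/* -1 once the fd has been dropped */
	bool started;
};

/* Models what GET_VRING_BASE does to the queue with this patch applied */
static void sketch_stop_queue(struct sketch_virtq *vq)
{
	vq->started = false;
	vq->vring.avail = NULL;	/* the line this patch adds */
	vq->call_fd = -1;	/* the real code close()s the fd first */
}

/* Assumed readiness check: true only if a notification would be attempted */
static bool sketch_would_notify(const struct sketch_virtq *vq)
{
	if (!vq->vring.avail)
		return false;	/* queue not ready, skip quietly */

	return true;		/* caller would go on to write to call_fd */
}
```

Without the vring.avail reset, sketch_would_notify() would keep returning
true after sketch_stop_queue(), and the caller would write to fd -1.
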
 vhost_user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/vhost_user.c b/vhost_user.c
index 7ab13774..be1aa942 100644
--- a/vhost_user.c
+++ b/vhost_user.c
@@ -732,6 +732,7 @@ static bool vu_get_vring_base_exec(struct vu_dev *vdev,
 	msg->hdr.size = sizeof(msg->payload.state);
 
 	vdev->vq[idx].started = false;
+	vdev->vq[idx].vring.avail = 0;
 
 	if (vdev->vq[idx].call_fd != -1) {
 		close(vdev->vq[idx].call_fd);
-- 
2.48.1