GET_VRING_BASE stops the queue, clearing the call and kick fds. However, we don't clear vring.avail. That means that if vu_queue_notify() is called it won't realise the queue isn't ready and will die with an EBADFD. We get this during migration, because for some reason, qemu reconfigures the vhost-user device when a migration is triggered. There's a window between the GET_VRING_BASE and re-establishing the call fd where the notify function can be called, causing a crash. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- vhost_user.c | 1 + 1 file changed, 1 insertion(+) diff --git a/vhost_user.c b/vhost_user.c index 7ab13774..be1aa942 100644 --- a/vhost_user.c +++ b/vhost_user.c @@ -732,6 +732,7 @@ static bool vu_get_vring_base_exec(struct vu_dev *vdev, msg->hdr.size = sizeof(msg->payload.state); vdev->vq[idx].started = false; + vdev->vq[idx].vring.avail = 0; if (vdev->vq[idx].call_fd != -1) { close(vdev->vq[idx].call_fd); -- 2.48.1