[PATCH] isolation: keep CAP_DAC_OVERRIDE initially
Reproducer that I'd expect to work
$ cd $HOME
$ sudo passt --runas $UID --socket foo.sock
Failed to bind UNIX domain socket: Permission denied
A more practical example is for libguestfs apps when run as user=root.
+ libguestfs connects to libvirt qemu:///system
+ libvirt qemu:///system defaults to user=qemu.
+ chowns passt runtime dir to user=qemu
+ libguestfs instead requests the VM run as user=root
+ patches in progress but we are blocked by this issue
+ passt is launched as root, but can't open socket in passt dir.
Obviously libvirt needs improvements too.
But it seems like this is a defect as well.
Signed-off-by: Cole Robinson
[Cc: Yumei as this is somewhat related to
https://archives.passt.top/passt-dev/20250926011714.5978-1-yuhuang@redhat.co...,
and David as he wrote most of this part]
On Tue, 7 Oct 2025 08:16:39 -0400
Cole Robinson wrote:
> Reproducer that I'd expect to work
>
> $ cd $HOME
> $ sudo passt --runas $UID --socket foo.sock
> Failed to bind UNIX domain socket: Permission denied
>
> A more practical example is for libguestfs apps when run as user=root.
>
> + libguestfs connects to libvirt qemu:///system
> + libvirt qemu:///system defaults to user=qemu.
> + chowns passt runtime dir to user=qemu
> + libguestfs instead requests the VM run as user=root
> + patches in progress but we are blocked by this issue
> + passt is launched as root, but can't open socket in passt dir.
>
> Obviously libvirt needs improvements too.
> But it seems like this is a defect as well.
Thanks for the patch! I think it's absolutely unproblematic to keep CAP_DAC_OVERRIDE for a moment at the beginning. Did you figure out exactly why it's needed by the way?
> Signed-off-by: Cole Robinson
Should we add:
  Link: https://github.com/libguestfs/libguestfs/pull/218
? Or it's misleading, or you omitted it for any other reason?
> ---
>  isolation.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/isolation.c b/isolation.c
> index bbcd23b..b25f349 100644
> --- a/isolation.c
> +++ b/isolation.c
> @@ -188,6 +188,9 @@ void isolate_initial(int argc, char **argv)
>  	 * We have to keep CAP_SETUID and CAP_SETGID at this stage, so
>  	 * that we can switch user away from root.
>  	 *
> +	 * CAP_DAC_OVERRIDE may be required for socket setup when combined
> +	 * with --runas.
> +	 *
>  	 * We have to keep some capabilities for the --netns-only case:
>  	 * - CAP_SYS_ADMIN, so that we can setns() to the netns.
>  	 * - Keep CAP_NET_ADMIN, so that we can configure interfaces
> @@ -198,7 +201,7 @@ void isolate_initial(int argc, char **argv)
>  	 * isolate_prefork().
>  	 */
>  	keep = BIT(CAP_NET_BIND_SERVICE) | BIT(CAP_SETUID) | BIT(CAP_SETGID) |
> -	       BIT(CAP_SYS_ADMIN) | BIT(CAP_NET_ADMIN);
> +	       BIT(CAP_SYS_ADMIN) | BIT(CAP_NET_ADMIN) | BIT(CAP_DAC_OVERRIDE);
>
>  	/* Since Linux 5.12, if we want to update /proc/self/uid_map to create
>  	 * a mapping from UID 0, which only happens with pasta spawning a child
-- Stefano
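As an illustration of what the one-line change in the quoted diff does, here is a minimal sketch of the "keep" mask before and after the patch. It assumes that BIT(n) is a plain one-bit shift; the function names keep_before(), keep_after(), mask_has() and demo_keep_mask() are hypothetical and are not part of isolation.c. The CAP_* numbers come from <linux/capability.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/capability.h>	/* CAP_* capability numbers */

/* Assumption: BIT() is the usual single-bit mask helper */
#define BIT(n)	(UINT64_C(1) << (n))

/* Capabilities kept before the patch (the '-' line of the diff) */
uint64_t keep_before(void)
{
	return BIT(CAP_NET_BIND_SERVICE) | BIT(CAP_SETUID) | BIT(CAP_SETGID) |
	       BIT(CAP_SYS_ADMIN) | BIT(CAP_NET_ADMIN);
}

/* Capabilities kept after the patch (the '+' line of the diff) */
uint64_t keep_after(void)
{
	return keep_before() | BIT(CAP_DAC_OVERRIDE);
}

/* Hypothetical helper: is capability number 'cap' set in 'mask'? */
int mask_has(uint64_t mask, int cap)
{
	return !!(mask & BIT(cap));
}

/* The patch adds exactly one capability to the initial set */
void demo_keep_mask(void)
{
	assert(!mask_has(keep_before(), CAP_DAC_OVERRIDE));
	assert(mask_has(keep_after(), CAP_DAC_OVERRIDE));
	assert((keep_before() ^ keep_after()) == BIT(CAP_DAC_OVERRIDE));
}
```

Calling demo_keep_mask() from a small main() runs the three checks. The sketch only models the mask itself; how and when the remaining capabilities are dropped later is not shown here.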
On Tue, Oct 07, 2025 at 06:02:32PM +0200, Stefano Brivio wrote:
> [Cc: Yumei as this is somewhat related to
> https://archives.passt.top/passt-dev/20250926011714.5978-1-yuhuang@redhat.co...,
> and David as he wrote most of this part]
>
> On Tue, 7 Oct 2025 08:16:39 -0400
> Cole Robinson wrote:
> > Reproducer that I'd expect to work
> >
> > $ cd $HOME
> > $ sudo passt --runas $UID --socket foo.sock
> > Failed to bind UNIX domain socket: Permission denied
> >
> > A more practical example is for libguestfs apps when run as user=root.
> >
> > + libguestfs connects to libvirt qemu:///system
> > + libvirt qemu:///system defaults to user=qemu.
> > + chowns passt runtime dir to user=qemu
> > + libguestfs instead requests the VM run as user=root
> > + patches in progress but we are blocked by this issue
> > + passt is launched as root, but can't open socket in passt dir.
> >
> > Obviously libvirt needs improvements too.
> > But it seems like this is a defect as well.
>
> Thanks for the patch! I think it's absolutely unproblematic to keep
> CAP_DAC_OVERRIDE for a moment at the beginning. Did you figure out
> exactly why it's needed by the way?
It's because the socket directory is chowned to qemu (by libvirt).
Without CAP_DAC_OVERRIDE, root can't open or write to a file or
directory it doesn't own, unless the mode bits grant that access to
other users. Cole explains a bit more at the end of this comment:
https://github.com/libguestfs/libguestfs/pull/218#issuecomment-3376943380
Rich.
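To make the explanation above concrete, here is a deliberately simplified model of the write-permission check for a directory. It is a sketch under stated assumptions, not how the kernel implements it: supplementary groups, ACLs, search permission and security modules are ignored, the struct and function names are hypothetical, and the UID/GID 107 for "qemu" and the 0700 directory mode are made-up values for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>	/* uid_t, gid_t, mode_t */
#include <sys/stat.h>	/* S_IWUSR, S_IWGRP, S_IWOTH */

/* Hypothetical, simplified view of a directory's ownership and mode */
struct dir_perm {
	uid_t owner;
	gid_t group;
	mode_t mode;
};

/* Hypothetical, simplified view of a process's credentials */
struct proc_cred {
	uid_t fsuid;
	gid_t fsgid;
	bool cap_dac_override;
};

/* Simplified check: may the process create an entry in the directory? */
bool may_write_dir(const struct proc_cred *p, const struct dir_perm *d)
{
	if (p->cap_dac_override)	/* capability bypasses the mode bits */
		return true;
	if (p->fsuid == d->owner)	/* owner class */
		return d->mode & S_IWUSR;
	if (p->fsgid == d->group)	/* group class */
		return d->mode & S_IWGRP;
	return d->mode & S_IWOTH;	/* everybody else, UID 0 included */
}

/* Directory chowned to an assumed qemu:qemu (107:107), mode 0700 */
void demo_libvirt_case(void)
{
	struct dir_perm dir = { .owner = 107, .group = 107, .mode = 0700 };
	struct proc_cred root_without_cap = { .fsuid = 0, .fsgid = 0,
					      .cap_dac_override = false };
	struct proc_cred root_with_cap = { .fsuid = 0, .fsgid = 0,
					   .cap_dac_override = true };

	assert(!may_write_dir(&root_without_cap, &dir));
	assert(may_write_dir(&root_with_cap, &dir));
}
```

The point of the model is that UID 0 on its own gets no special treatment: once CAP_DAC_OVERRIDE is gone, root falls into the "other" class for a directory owned by qemu, and binding a UNIX domain socket there fails because it needs to create a directory entry.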
> > Signed-off-by: Cole Robinson
>
> Should we add:
>
>   Link: https://github.com/libguestfs/libguestfs/pull/218
>
> ? Or it's misleading, or you omitted it for any other reason?
>
> > ---
> >  isolation.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/isolation.c b/isolation.c
> > index bbcd23b..b25f349 100644
> > --- a/isolation.c
> > +++ b/isolation.c
> > @@ -188,6 +188,9 @@ void isolate_initial(int argc, char **argv)
> >  	 * We have to keep CAP_SETUID and CAP_SETGID at this stage, so
> >  	 * that we can switch user away from root.
> >  	 *
> > +	 * CAP_DAC_OVERRIDE may be required for socket setup when combined
> > +	 * with --runas.
> > +	 *
> >  	 * We have to keep some capabilities for the --netns-only case:
> >  	 * - CAP_SYS_ADMIN, so that we can setns() to the netns.
> >  	 * - Keep CAP_NET_ADMIN, so that we can configure interfaces
> > @@ -198,7 +201,7 @@ void isolate_initial(int argc, char **argv)
> >  	 * isolate_prefork().
> >  	 */
> >  	keep = BIT(CAP_NET_BIND_SERVICE) | BIT(CAP_SETUID) | BIT(CAP_SETGID) |
> > -	       BIT(CAP_SYS_ADMIN) | BIT(CAP_NET_ADMIN);
> > +	       BIT(CAP_SYS_ADMIN) | BIT(CAP_NET_ADMIN) | BIT(CAP_DAC_OVERRIDE);
> >
> >  	/* Since Linux 5.12, if we want to update /proc/self/uid_map to create
> >  	 * a mapping from UID 0, which only happens with pasta spawning a child
>
> -- Stefano
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
On 10/7/25 12:02 PM, Stefano Brivio wrote:
> [Cc: Yumei as this is somewhat related to
> https://archives.passt.top/passt-dev/20250926011714.5978-1-yuhuang@redhat.co...,
> and David as he wrote most of this part]
>
> On Tue, 7 Oct 2025 08:16:39 -0400
> Cole Robinson wrote:
> > Reproducer that I'd expect to work
> >
> > $ cd $HOME
> > $ sudo passt --runas $UID --socket foo.sock
> > Failed to bind UNIX domain socket: Permission denied
> >
> > A more practical example is for libguestfs apps when run as user=root.
> >
> > + libguestfs connects to libvirt qemu:///system
> > + libvirt qemu:///system defaults to user=qemu.
> > + chowns passt runtime dir to user=qemu
> > + libguestfs instead requests the VM run as user=root
> > + patches in progress but we are blocked by this issue
> > + passt is launched as root, but can't open socket in passt dir.
> >
> > Obviously libvirt needs improvements too.
> > But it seems like this is a defect as well.
>
> Thanks for the patch! I think it's absolutely unproblematic to keep
> CAP_DAC_OVERRIDE for a moment at the beginning. Did you figure out
> exactly why it's needed by the way?
Last line in the list above should read:
+ passt is launched as root, but can't open socket in passt dir
  because it's owned by qemu.qemu
> > Signed-off-by: Cole Robinson
>
> Should we add:
>
>   Link: https://github.com/libguestfs/libguestfs/pull/218
>
> ? Or it's misleading, or you omitted it for any other reason?
Works for me! I did not intentionally omit it.
Thanks,
Cole
On Tue, Oct 07, 2025 at 08:16:39AM -0400, Cole Robinson wrote:
> Reproducer that I'd expect to work
>
> $ cd $HOME
> $ sudo passt --runas $UID --socket foo.sock
> Failed to bind UNIX domain socket: Permission denied
>
> A more practical example is for libguestfs apps when run as user=root.
>
> + libguestfs connects to libvirt qemu:///system
> + libvirt qemu:///system defaults to user=qemu.
> + chowns passt runtime dir to user=qemu
> + libguestfs instead requests the VM run as user=root
> + patches in progress but we are blocked by this issue
> + passt is launched as root, but can't open socket in passt dir.
>
> Obviously libvirt needs improvements too.
> But it seems like this is a defect as well.
>
> Signed-off-by: Cole Robinson
Reviewed-by: David Gibson
> ---
>  isolation.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/isolation.c b/isolation.c
> index bbcd23b..b25f349 100644
> --- a/isolation.c
> +++ b/isolation.c
> @@ -188,6 +188,9 @@ void isolate_initial(int argc, char **argv)
>  	 * We have to keep CAP_SETUID and CAP_SETGID at this stage, so
>  	 * that we can switch user away from root.
>  	 *
> +	 * CAP_DAC_OVERRIDE may be required for socket setup when combined
> +	 * with --runas.
> +	 *
>  	 * We have to keep some capabilities for the --netns-only case:
>  	 * - CAP_SYS_ADMIN, so that we can setns() to the netns.
>  	 * - Keep CAP_NET_ADMIN, so that we can configure interfaces
> @@ -198,7 +201,7 @@ void isolate_initial(int argc, char **argv)
>  	 * isolate_prefork().
>  	 */
>  	keep = BIT(CAP_NET_BIND_SERVICE) | BIT(CAP_SETUID) | BIT(CAP_SETGID) |
> -	       BIT(CAP_SYS_ADMIN) | BIT(CAP_NET_ADMIN);
> +	       BIT(CAP_SYS_ADMIN) | BIT(CAP_NET_ADMIN) | BIT(CAP_DAC_OVERRIDE);
>
>  	/* Since Linux 5.12, if we want to update /proc/self/uid_map to create
>  	 * a mapping from UID 0, which only happens with pasta spawning a child
> --
> 2.51.0
--
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
participants (4)
- Cole Robinson
- David Gibson
- Richard W.M. Jones
- Stefano Brivio