[Cc'ing Rich... Rich, see my first question below]
On Wed, 24 Sep 2025 09:58:57 +0800 Yumei Huang wrote:
On Tue, Sep 23, 2025 at 6:32 PM Stefano Brivio wrote:
On Tue, 23 Sep 2025 14:36:41 +0800 Yumei Huang wrote:
On Tue, Sep 23, 2025 at 4:03 AM Stefano Brivio wrote:
On Mon, 22 Sep 2025 11:03:23 +0800 Yumei Huang wrote:
On Fri, Sep 19, 2025 at 5:58 PM Stefano Brivio wrote:
On Fri, 19 Sep 2025 09:43:29 +0800 Yumei Huang wrote:
> Signed-off-by: Yumei Huang
> ---
>  test/README.md | 31 +++++++++++++++++++++++++++++--
>  1 file changed, 29 insertions(+), 2 deletions(-)
>
> diff --git a/test/README.md b/test/README.md
> index 91ca603..e3e9d37 100644
> --- a/test/README.md
> +++ b/test/README.md
> @@ -32,7 +32,7 @@ Example for Debian, and possibly most Debian-based distributions:
>      git go iperf3 isc-dhcp-common jq libgpgme-dev libseccomp-dev linux-cpupower
>      lm-sensors lz4 netavark netcat-openbsd psmisc qemu-efi-aarch64
>      qemu-system-arm qemu-system-misc qemu-system-ppc qemu-system-x86
> -    qemu-system-x86 sipcalc socat strace tmux uidmap valgrind
> +    sipcalc socat strace tmux uidmap valgrind
>
>  NOTE: the tests need a qemu version >= 7.2, or one that contains commit
>  13c6be96618c ("net: stream: add unix socket"): this change introduces support
> @@ -81,7 +81,12 @@ The following additional packages are commonly needed:
>
>  ## Regular test
>
> -Just issue:
> +Before running the tests, you need to prepare the required assets:
> +
> +    cd test
> +    make assets
> +
> +Then issue:
>
>      ./run
>
> @@ -91,6 +96,28 @@ variable settings: DEBUG=1 enables debugging messages, TRACE=1 enables tracing
>
>      PCAP=1 TRACE=1 ./run
>
> +**Note:**
> +
> +* It's recommended to run the commands as a non-root user.
> +  Due to [Bug 967509](https://bugzilla.redhat.com/show_bug.cgi?id=967509),
> +  if you switch users with `su` or `sudo`, the directory `/run/user/ID` may
> +  not be created. In that case, `XDG_RUNTIME_DIR` will incorrectly point to
> +  `/run/user/0` instead of `/run/user/ID`, which can cause error.

Thanks for the research, I wasn't aware of that, and recently spent quite some time figuring that out (for other reasons):
https://issues.redhat.com/browse/RHEL-70222
in that case, XDG_RUNTIME_DIR was simply not set. Things were working with 'machinectl shell' instead.
At the same time: running this whole stuff as root sounds rather crazy, unless it's a throw-away VM with absolutely nothing important on it.
That is, regardless of the issue with XDG_RUNTIME_DIR. I would maybe make the wording stronger, something like:
* Don't run the tests as root, it's not needed!
* If you really need to, note that ...
> +  **Workaround:** Log out and log back in as the intended user to ensure the
> +  correct runtime directory is set up.
We could also suggest 'machinectl shell' if it's really needed for whatever reason.
I'm not sure how 'machinectl shell' works here. The error happens when running 'make assets', which calls 'prepare-distro-img.sh' script, which calls 'virsh edit'.
Ah, I didn't know! So this is actually similar to https://issues.redhat.com/browse/RHEL-70222.
If we run 'make assets' with root, the error is like this:
./prepare-distro-img.sh prepared-debian-8.11.0-openstack-amd64.qcow2
libguestfs: error: could not create appliance through libvirt. Original error from libvirt: Cannot access storage file '/home/test/passt/test/prepared-debian-8.11.0-openstack-amd64.qcow2' (as uid:107, gid:107): Permission denied [code=38 int1=13]
If we switch to a non-root user via 'su', the error is like this:
./prepare-distro-img.sh prepared-debian-8.11.0-openstack-amd64.qcow2
libvirt: XML-RPC error : Cannot create user runtime directory '/run/user/0/libvirt': Permission denied
libguestfs: error: could not connect to libvirt (URI = qemu:///session): Cannot create user runtime directory '/run/user/0/libvirt': Permission denied [code=38 int1=13]
make: *** [Makefile:115: prepared-debian-8.11.0-openstack-amd64.qcow2] Error 1
Do you mean running 'make assets' with 'machinectl shell'? What's the exact command here? I tried this, but it doesn't seem to work:
# machinectl shell --uid=$(id -u pat) .host /home/test/passt/test/make assets
Connected to the local host. Press ^] three times within 1s to exit session.

Connection to the local host terminated.
No, I mean using 'machinectl shell' instead of 'su' (it's intended as a replacement), that is:
$ machinectl shell
# make assets
...because that one will set XDG_RUNTIME_DIR.
Yes, 'machinectl shell' will solve the issue when switching to a non-root user via su. But it doesn't solve the issue when running 'make assets' as root. They are actually different issues, as shown above.
Could one specify an XDG_RUNTIME_DIR that actually exists, maybe? Does that work?
I guess I need to describe the issues more clearly.
a) If we log in to the system as the non-root user, `/run/user/ID` is created and XDG_RUNTIME_DIR points to it correctly. So 'make assets' works well.
b) If we log in to the system as root, then switch to a non-root user via 'su', 'make assets' fails due to Bug 967509. XDG_RUNTIME_DIR is not reset and still points to /run/user/(ID of the previous user), which is /run/user/0.
libguestfs: error: could not connect to libvirt (URI = qemu:///session): Cannot create user runtime directory '/run/user/0/libvirt': Permission denied [code=38 int1=13]
Switching the user with 'machinectl shell --uid=$user' can solve the issue.
c) If we run 'make assets' as root (no matter whether we log in as root directly, or switch to root via su or machinectl shell), 'make assets' always fails with a different error.
libguestfs: error: could not create appliance through libvirt. Original error from libvirt: Cannot access storage file '/home/pat/tmp/t5-passt/test/prepared-debian-10-nocloud-amd64.qcow2' (as uid:107, gid:107): Permission denied [code=38 int1=13]
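To tell cases a), b) and c) apart before running 'make assets', a quick check along these lines might help. It is only a sketch: the function name `runtime_dir_status` is made up for illustration and is not part of the passt test suite, and it assumes the usual systemd convention that a login session's runtime directory is /run/user/UID.

```shell
#!/bin/sh
# Hypothetical helper, not part of the passt test suite.
# Usage: runtime_dir_status UID [value of XDG_RUNTIME_DIR]
# Prints one of: root, unset, ok, mismatch
runtime_dir_status() {
	uid="$1"
	dir="${2:-}"

	if [ "$uid" -eq 0 ]; then
		# Case c): running as root, XDG_RUNTIME_DIR is not the problem
		echo "root"
	elif [ -z "$dir" ]; then
		# Variable not set at all (as reported in RHEL-70222)
		echo "unset"
	elif [ "$dir" = "/run/user/$uid" ]; then
		# Case a): regular login session
		echo "ok"
	else
		# Case b): stale value, e.g. /run/user/0 after 'su'
		echo "mismatch"
	fi
}

# Report the status for the current user and environment
runtime_dir_status "$(id -u)" "${XDG_RUNTIME_DIR:-}"
```

A "mismatch" result would point at case b), where switching users with 'machinectl shell --uid=$user' instead of 'su' is the workaround mentioned above.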
Ah, look, UID 107 is usually QEMU on Fedora / RHEL, so libguestfs is switching to that. But it shouldn't, because then it won't be able to access the images you downloaded as root. Rich, do you know why it happens?
The XDG_RUNTIME_DIR is no longer an issue, since root can access every directory under /run/user. I guess the problem here is that we just can't run 'virsh edit' as root.
Wait, it's not 'virsh edit', it's 'virt-edit', and it's not true that you can't run it as root.

We had explicit reports from libguestfs (virt-edit is part of it, and it now uses passt to provide networking to the temporary VMs it uses to edit guest images) being run as root, and passt breaking that in the past.

We need to support that because virtual machine images might be owned by root if the virtual machines they belong to can't run unprivileged. Breaking operation as root is actually pretty bad for security in that case, since it encourages / forces users to make those images accessible to other users. See:

  https://issues.redhat.com/browse/RHEL-36045
  45b8632dcc0e ("conf: Don't lecture user about starting us as root")
  c9b241346569 ("conf, passt, tap: Open socket and PID files before switching UID/GID").
Maybe we can just put it like:
Running the commands as root is just not allowed. If you log in to the system as root, don't use su to switch users, due to [Bug 967509](https://bugzilla.redhat.com/show_bug.cgi?id=967509). Log out and log back in as the intended user, or use 'machinectl shell --uid=$user'.
What do you think?
Well, it's free software, so "not allowed" doesn't really mean much.
I would simply warn users that it's a bad idea and it's not needed, something like my previous proposal:
* Don't run the tests as root, it's not needed!
* If you really need to, note that ...
and then just list the workaround that actually works.
I think the most typical need for running things as root is that you don't actually have other users (it happens with some VM images or in embedded systems), so 'machinectl shell --uid=$user' won't really help there.
Well, I have to admit that I usually do everything as root on my test machines. And I don't see a solution or workaround for the failure when running 'make assets' as root, as in c).
It's not so important for our tests, but it would be good to know why it breaks, in general.
The proposed workaround is just for those who log in as root and then switch to a non-root user to run the tests.
Maybe just mention setting XDG_RUNTIME_DIR to whatever is appropriate (does /tmp work?)?
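For anybody who wants to try that out, a minimal sketch follows. It uses a private directory under /tmp rather than /tmp itself, because the XDG Base Directory specification requires the runtime directory to be owned by the user with mode 0700. Whether libvirt and libguestfs accept such a directory is exactly the open question here, so treat this as an experiment; `private_runtime_dir` is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: untested against libvirt/libguestfs.
# private_runtime_dir is a hypothetical helper name.
private_runtime_dir() {
	# mktemp -d creates a fresh directory that only the calling user can access
	dir="$(mktemp -d "${TMPDIR:-/tmp}/xdg-runtime.XXXXXX")" || return 1
	echo "$dir"
}

XDG_RUNTIME_DIR="$(private_runtime_dir)" || exit 1
export XDG_RUNTIME_DIR
echo "XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR"

# Then, from the test directory:
#   make assets
```

The directory is not cleaned up automatically, so it should be removed by hand once the assets are built.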
> +* SELinux may prevent the tests from running correctly. To avoid this,
> +  temporarily disable it by running:
> +
> +      setenforce 0
By the way, other than the DHCP client not working on Fedora in a namespace (which we should really fix, I can look into it if you share the messages you're getting from /var/log/audit/audit.log), did you hit any other issue with it?
Sure, I will send you a link containing the audit.log. BTW, with 'setenforce 1', the tests get stuck at 'DHCPv6: address'. It looks like an endless loop there, so apart from the very first few tests, nothing else has been executed.
=== pasta/dhcp DEBUG:DEBUG:DEBUG:DEBUG:DEBUG:DEBUG:DEBUG:DEBUG:DEBUG:DEBUG:DEBUG:DEBUG:DEBUG:DEBUG:DEBUG:DEBUG:DEBUG:> Interface name DEBUG:DEBUG:? [ -n "eno8303" ] DEBUG:DEBUG:...passed.
DHCP: address DEBUG:DEBUG:DEBUG:DEBUG:? [ @EMPTY@ = 10.72.136.30 ] < failed. DEBUG:DEBUG:...failed.
DHCP: route DEBUG:DEBUG:DEBUG:? [ @EMPTY@ = 10.72.139.254 ] < failed. DEBUG:DEBUG:...failed.
DHCP: MTU DEBUG:DEBUG:? [ 1500 = 65520 ] < failed. DEBUG:DEBUG:...failed.
DHCPv6: address DEBUG:
Thanks, so it's the issue David recently mentioned with dhclient-script(8) being prevented by SELinux from setting up addresses and routes via ip(8), even though it's in a detached namespace and that should be allowed.
We should probably add something like we do in contrib/selinux/pasta.te to https://github.com/fedora-selinux/selinux-policy:
roleattribute user_r usernetctl_roles;
role usernetctl_roles types <whatever dhclient runs as>;
Now, we can disable SELinux temporarily to run tests, but eventually we'll want to have tests with DHCP clients in unprivileged setups also as part of Fedora automated tests, and I'm fairly sure that those run with SELinux in enforcing mode. So we should really fix this.
Sure, I will file a ticket for that.
Thanks. Note that it's not a ticket for passt, it's maybe a ticket for fedora-selinux, but I'm not sure if it's really helpful to file issues there. I guess we should try things out and send a merge request.
Agree. Let's just disable it temporarily to bypass the issue.
Actually, that's not what I meant: I really think we should fix that. I'm just saying that filing tickets there is usually not very helpful. Anyway, noted on my list, let me take care of it, the change itself is kind of trivial. Whether it will be considered / accepted is another story, but I can try at least.
By the way, it would also be interesting to see if, once the test suite gets past this point, you get further messages in audit.log.
If you do 'setenforce 0', SELinux switches to the so-called "permissive mode": denials are still logged, but they won't block anything.
So, with 'setenforce 0', we can find out from audit.log if we would hit further failures.
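A rough way to do that check is to count the AVC denial records in the log after a test run. The helper below is only an illustration: `count_denials` is a made-up name, and the default path assumes the standard auditd log location, which needs root to read.

```shell
#!/bin/sh
# Hypothetical helper: count AVC denial records in an audit log.
# Usage: count_denials [path to audit.log]
count_denials() {
	log="${1:-/var/log/audit/audit.log}"
	# The number of spaces between "avc:" and "denied" can vary, hence ' *'.
	# grep -c prints the number of matching lines; it exits non-zero when
	# nothing matches, which is not an error for our purposes.
	grep -c 'avc: *denied' "$log" || true
}

# Example, as root, after running the tests with 'setenforce 0':
#   count_denials
#   count_denials /path/to/copy/of/audit.log
```

On systems with the audit tools installed, 'ausearch -m avc' gives a more readable view of the same records.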
Here is the audit.log:
https://privatebin.corp.redhat.com/?49fef5ef2e766c42#Fki5LnD9EGMfDDpmvFPp1Wq...
From what I can see, there is no 'avc: denied' after the dhcp cases.
[looking at log you attached later] great, I don't see other issues either! So it seems to be just that, and then we'll be able to run tests on Fedora-like distributions with SELinux enabled.
-- Stefano