On Fri, 16 May 2025 14:35:14 +0200
Paul Holzinger
On 16/05/2025 14:22, Max Chernoff wrote:
Hi Paul,
On Fri, 2025-05-16 at 13:59 +0200, Paul Holzinger wrote:
So I did test this patch with podman's system and e2e tests on podman v5.5.0 on Fedora Rawhide, and I noticed one problem that caused some failures:
podman build is broken with this policy, and I assume that means buildah would not work either. The difference is that in the build case we do not pass a bind-mounted namespace path under /run, but rather /proc/$pid/ns/net, as the path to pasta. We get this error:
pasta failed with exit code 1: Couldn't open network namespace /proc/360143/ns/net: Permission denied
Logged avc: denied { search } for pid=360144 comm="pasta.avx2" name="360143" dev="proc" ino=2030208 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=0

Odd, it works for me:
$ id -Z
user_u:user_r:user_t:s0-s0:c0.c1023

$ podman --version
podman version 5.4.2

$ pasta --version
pasta 0^20250512.g8ec1341-1.fc42.x86_64
Copyright Red Hat
GNU General Public License, version 2 or later
  https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

$ cat Containerfile
FROM registry.fedoraproject.org/fedora-minimal:42
RUN dnf install --assumeyes python3
$ podman build --no-cache --network=pasta .
STEP 1/2: FROM registry.fedoraproject.org/fedora-minimal:42
STEP 2/2: RUN dnf install --assumeyes python3
Updating and loading repositories:
 Fedora 42 - x86_64 - Updates           100% |   8.3 MiB/s |   6.8 MiB |  00m01s
 Fedora 42 openh264 (From Cisco) - x86_ 100% |   7.7 KiB/s |   6.0 KiB |  00m01s
 Fedora 42 - x86_64                     100% |  12.3 MiB/s |  35.4 MiB |  00m03s
Repositories loaded.
Package                      Arch   Version        Repository      Size
Installing:
 python3                     x86_64 3.13.3-2.fc42  updates     28.7 KiB
Installing dependencies:
 expat                       x86_64 2.7.1-1.fc42   fedora     290.2 KiB
 libb2                       x86_64 0.98.1-13.fc42 fedora      46.1 KiB
 libgomp                     x86_64 15.1.1-1.fc42  updates    538.5 KiB
 mpdecimal                   x86_64 4.0.1-1.fc42   updates    217.2 KiB
 python-pip-wheel            noarch 24.3.1-2.fc42  fedora       1.2 MiB
 python3-libs                x86_64 3.13.3-2.fc42  updates     39.9 MiB
 readline                    x86_64 8.2-13.fc42    fedora     485.0 KiB
 tzdata                      noarch 2025b-1.fc42   fedora       1.6 MiB
Installing weak dependencies:
 python-unversioned-command  noarch 3.13.3-2.fc42  updates     23.0   B

Transaction Summary:
 Installing:        10 packages

Total size of inbound packages is 12 MiB. Need to download 12 MiB.
After this operation, 44 MiB extra will be used (install 44 MiB, remove 0 B).
[ 1/10] python3-0:3.13.3-2.fc42.x86_64  100% | 109.6 KiB/s |  29.7 KiB |  00m00s
[...]
[12/12] Installing python-unversioned-c 100% |   9.6 KiB/s | 424.0   B |  00m00s
Complete!
COMMIT
--> edfb5d3fee4c
edfb5d3fee4c729c0ec373150bd382e5a8461bc6ce18b14bcc12606d65ee185f
$ ps auxZ | grep pasta  # In another terminal while the above is running
user_u:user_r:container_runtime_t:s0-s0:c0.c1023 test-us+ 497555 0.4 0.1 2533448 48028 pts/2 Sl+ 06:11 0:00 podman build --no-cache --network=pasta .
user_u:user_r:pasta_t:s0-s0:c0.c1023 test-us+ 497680 1.1 0.0 206444 17188 ? Ss 06:11 0:00 /usr/sbin/pasta --config-net --dns-forward 169.254.1.1 -t none -u none -T none -U none --no-map-gw --quiet --netns /proc/497672/ns/net --map-guest-addr 169.254.1.2
What are the SELinux contexts of the network namespaces? This is what I get:
$ ls -laZ $XDG_RUNTIME_DIR/netns $XDG_RUNTIME_DIR/containers/networks/rootless-netns /proc/self/ns/net
ls: cannot access '/run/user/959/netns': No such file or directory
lrwxrwxrwx. 1 test-user test-user user_u:user_r:user_t:s0-s0:c0.c1023 0 May 16 06:15 /proc/self/ns/net -> 'net:[4026531840]'
It seems to be unconfined for me:
lrwxrwxrwx. 1 test test unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 0 May 16 08:32 /proc/self/ns/net -> 'net:[4026531840]'
/run/user/959/containers/networks/rootless-netns:
total 0
drwx------. 2 test-user test-user user_u:object_r:ifconfig_var_run_t:s0 40 May 16 06:05 ./
drwx------. 3 test-user test-user user_u:object_r:user_tmp_t:s0 60 May 16 06:05 ../

/run/user/1001/containers/networks/rootless-netns:
total 0
drwx------. 2 test test unconfined_u:object_r:ifconfig_var_run_t:s0 40 May 16 06:26 .
drwx------. 4 test test unconfined_u:object_r:user_tmp_t:s0 120 May 16 06:26 ..

/run/user/1001/netns:
total 0
drwxr-xr-x. 2 test test unconfined_u:object_r:ifconfig_var_run_t:s0 40 May 16 07:31 .
drwx------. 9 test test unconfined_u:object_r:user_tmp_t:s0 200 May 16 06:19 ..
I'm getting the same issue with 'podman build' and the Containerfile shared by Max. Running with SELinux in permissive mode, I'm getting:

# cat /var/log/audit/audit.log
type=AVC msg=audit(1747410763.621:130615): avc: denied { search } for pid=1352409 comm="pasta.avx2" name="1352408" dev="proc" ino=7022238 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1747410763.621:130616): avc: denied { read } for pid=1352409 comm="pasta.avx2" name="net" dev="proc" ino=7022285 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=lnk_file permissive=1
type=AVC msg=audit(1747410763.622:130617): avc: denied { read } for pid=1352409 comm="pasta.avx2" scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=file permissive=1
type=AVC msg=audit(1747410763.622:130618): avc: denied { read } for pid=1352409 comm="pasta.avx2" name="ns" dev="proc" ino=7022284 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1747410763.622:130619): avc: denied { open } for pid=1352409 comm="pasta.avx2" path="/proc/1352408/ns" dev="proc" ino=7022284 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1747410764.622:130620): avc: denied { read } for pid=1352417 comm="pasta.avx2" name="net" dev="proc" ino=7022285 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c609,c838 tclass=lnk_file permissive=1

and:

# audit2allow -a

#============= pasta_t ==============
allow pasta_t container_runtime_t:dir { open read search };
allow pasta_t container_runtime_t:file read;
allow pasta_t container_runtime_t:lnk_file read;
allow pasta_t container_t:lnk_file read;

If I add those rules, everything works (well, I'm not saying that's the solution...).

This is a Fedora virtual machine with:

# uname -a
Linux passt.top 6.11.0-0.rc3.30.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Aug 12 14:18:21 UTC 2024 x86_64 GNU/Linux
# rpm -qe podman passt
podman-5.5.0~rc2-1.fc43.x86_64
passt-0^20250512.g8ec1341-1.fc43.x86_64

To me those denials look reasonable, in the sense that I would expect the namespace links to have container_runtime_t type.

By the way:

$ ls -laZ $XDG_RUNTIME_DIR/netns $XDG_RUNTIME_DIR/containers/networks/rootless-netns /proc/self/ns/net
lrwxrwxrwx. 1 sbrivio sbrivio unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 0 May 16 15:59 /proc/self/ns/net -> 'net:[4026531840]'

/run/user/1001/containers/networks/rootless-netns:
total 0
drwx------. 2 sbrivio sbrivio unconfined_u:object_r:ifconfig_var_run_t:s0 40 May 15 15:00 .
drwx------. 4 sbrivio sbrivio unconfined_u:object_r:user_tmp_t:s0 120 May 15 15:00 ..

/run/user/1001/netns:
total 0
drwxr-xr-x. 2 sbrivio sbrivio unconfined_u:object_r:ifconfig_var_run_t:s0 40 May 15 15:00 .
drwx------. 8 sbrivio sbrivio unconfined_u:object_r:user_tmp_t:s0 220 May 6 08:02 ..

Max, could it be that you're running stuff with some customised SELinux policy?

By the way, with "unconfined disabled":

  https://bugzilla.redhat.com/show_bug.cgi?id=2330512

we seem to have unconfined_t as type for those links:

type=AVC msg=audit(1733378482.320:31258): avc: denied { open } for pid=651955 comm="pasta.avx2" path="/proc/651954/ns" dev="proc" ino=2904841 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1

...but I'm not sure at which point in time exactly.
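
For anyone following along who wants to see roughly how AVC denials like the ones in this thread collapse into allow rules, here is a minimal Python sketch. It is not how audit2allow is actually implemented, just an illustration of the grouping by source type, target type and class; the names AVC_RE, selinux_type() and suggest_rules() are made up for this example, and it only looks at the fields shown in the audit lines quoted here.

```python
import re
from collections import defaultdict

# Hypothetical helper names; this only illustrates the grouping idea and
# is not audit2allow's real implementation.
# The pattern picks out the permissions, both contexts and the object class
# from a single "avc: denied" record.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?"
    r"scontext=(?P<scontext>\S+)\s+"
    r"tcontext=(?P<tcontext>\S+)\s+"
    r"tclass=(?P<tclass>\S+)"
)


def selinux_type(context):
    """Return the type, i.e. the third colon-separated field of a context."""
    return context.split(":")[2]


def suggest_rules(log_text):
    """Group denials by (source type, target type, class) into allow rules."""
    grouped = defaultdict(set)
    for match in AVC_RE.finditer(log_text):
        key = (
            selinux_type(match["scontext"]),
            selinux_type(match["tcontext"]),
            match["tclass"],
        )
        grouped[key].update(match["perms"].split())

    rules = []
    for (source, target, tclass), perms in sorted(grouped.items()):
        names = sorted(perms)
        # A single permission is written bare, several go inside braces.
        perm_text = names[0] if len(names) == 1 else "{ " + " ".join(names) + " }"
        rules.append(f"allow {source} {target}:{tclass} {perm_text};")
    return rules


if __name__ == "__main__":
    import sys

    # Usage sketch: python3 this_script.py < /var/log/audit/audit.log
    for rule in suggest_rules(sys.stdin.read()):
        print(rule)
```

As with the audit2allow output, whatever this prints is only a starting point for discussion, not necessarily the rules the policy should end up with.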
I am not familiar with the SELinux stuff, but if this is a boolean that users can configure, should this be documented in the man page here?

I guess more documentation is always a good thing, but most of the other container-related SELinux booleans seem to be undocumented:
$ sudo semanage boolean --list | grep ^container_
container_connect_any          (off , off)  Determine whether container can connect to all TCP ports.
container_manage_cgroup        (on , on)    Allow sandbox containers to manage cgroup (systemd)
container_read_certs           (off , off)  Allow all container domains to read cert files and directories
container_use_cephfs           (off , off)  Determine whether container can use ceph file system
container_use_devices          (off , off)  Allow containers to use any device volume mounted into container
container_use_dri_devices      (on , on)    Allow containers to use any dri device volume mounted into container
container_use_ecryptfs         (off , off)  Determine whether container can use ecrypt file system
container_use_xserver_devices  (off , off)  Allow containers to use any xserver device volume mounted into container, mostly used for GPU acceleration
container_user_exec_content    (on , on)    Allow container to user exec content
$ man -wK container_connect_any
No manual entry for container_connect_any

$ man -wK container_manage_cgroup
/usr/share/man/man1/podman-create.1.gz
/usr/share/man/man1/podman-run.1.gz
/usr/share/man/man7/podman-troubleshooting.7.gz

$ man -wK container_read_certs
No manual entry for container_read_certs

$ man -wK container_use_cephfs
No manual entry for container_use_cephfs

$ man -wK container_use_devices
/usr/share/man/man1/sesearch.1.gz
/usr/share/man/man1/podman-pod-clone.1.gz
/usr/share/man/man1/podman-pod-create.1.gz
/usr/share/man/man1/podman-build.1.gz
/usr/share/man/man1/podman-farm-build.1.gz
/usr/share/man/man1/podman-create.1.gz
/usr/share/man/man1/podman-run.1.gz
/usr/share/man/man8/setsebool.8.gz

$ man -wK container_user_exec_content
No manual entry for container_user_exec_content
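
In case somebody wants to repeat this kind of survey for another prefix (pasta_, for instance), here is a small Python sketch that parses the text printed by "semanage boolean --list" into records. The Boolean tuple and parse_booleans() are hypothetical names for this example only, and the line layout is assumed to be the "name (state , default) description" form shown in the listing quoted in this thread.

```python
import re
from typing import NamedTuple


class Boolean(NamedTuple):
    """One row of the listing (hypothetical record type for this sketch)."""
    name: str
    state: str
    default: str
    description: str


# Assumed layout: "<name> (<state> , <default>) <description>", as in the
# listing quoted in this thread. Header or blank lines do not match.
LINE_RE = re.compile(r"^(\S+)\s+\(\s*(\w+)\s*,\s*(\w+)\s*\)\s+(.*)$")


def parse_booleans(listing, prefix=""):
    """Return the booleans in the listing whose name starts with prefix."""
    result = []
    for line in listing.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group(1).startswith(prefix):
            result.append(Boolean(*match.groups()))
    return result


if __name__ == "__main__":
    import sys

    # Usage sketch: semanage boolean --list | python3 this_script.py pasta_
    wanted = sys.argv[1] if len(sys.argv) > 1 else ""
    for boolean in parse_booleans(sys.stdin.read(), wanted):
        print(f"{boolean.name}: {boolean.state} ({boolean.description})")
```

The names this returns could then be fed to "man -wK" one at a time, as done by hand above.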
I'll send a patch for the man pages tomorrow.
Wait a moment. I don't think something SELinux-specific belongs in pasta's man page, because that's not relevant for all users and distributions. We could maintain that as an addition for Fedora and perhaps Gentoo, but I wonder if it's really worth the effort.

Besides, I think that:

# semanage boolean --list | grep pasta
pasta_allow_bind_any_port      (on , on)    Allow pasta to allow bind any port

...this is the common practice to document those knobs (and where I usually look for things). We wouldn't have much to add to this anyway.

-- 
Stefano