[PATCH 0/1] selinux: Transition to pasta_t in containers
Hi,
Currently, pasta runs in the container_runtime_exec_t context when running in a container. This commit updates the SELinux policy so that pasta instead runs in the pasta_t context.
I'm more familiar with CIL, so I initially developed the modified policy in CIL, and then later ported it to the kernel policy language. My original CIL source is available here:
https://github.com/gucci-on-fleek/maxchernoff.ca/blob/master/etc/selinux/loc...
I've tested this on Fedora 42 with rootless Podman, with both unconfined (unconfined_u) and confined (user_u) users, and with both TCP and UDP.
I've never actually used the email workflow for Git before, so please let me know if I've done something wrong.
Thanks,
-- Max
Max Chernoff (1):
  selinux: Transition to pasta_t in containers
 contrib/selinux/pasta.fc | 10 ++++++----
 contrib/selinux/pasta.te | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+), 4 deletions(-)
--
2.49.0
Currently, pasta runs in the container_runtime_exec_t context when
running in a container. This is not ideal since it means that pasta runs
with more privileges than strictly necessary. This commit updates the
SELinux policy to have pasta transition to the pasta_t context when
started from the container_runtime_t context, adds the appropriate
labels to $XDG_RUNTIME_DIR/netns and
$XDG_RUNTIME_DIR/containers/networks/rootless-netns, and grants the
necessary permissions to the pasta_t context.
Link: https://bugs.passt.top/show_bug.cgi?id=81
Link: https://github.com/containers/podman/discussions/26100#discussioncomment-130...
Signed-off-by: Max Chernoff
On Wed, 14 May 2025 04:44:12 -0600
Max Chernoff
Currently, pasta runs in the container_runtime_exec_t context when running in a container. This is not ideal since it means that pasta runs with more privileges than strictly necessary. This commit updates the SELinux policy to have pasta transition to the pasta_t context when started from the container_runtime_t context, adds the appropriate labels to $XDG_RUNTIME_DIR/netns and $XDG_RUNTIME_DIR/containers/networks/rootless-netns, and grants the necessary permissions to the pasta_t context.
Link: https://bugs.passt.top/show_bug.cgi?id=81
Link: https://github.com/containers/podman/discussions/26100#discussioncomment-130...
Signed-off-by: Max Chernoff
Thanks, I think that with your patch we're almost there. (!)
I ran Podman tests covering pasta on Fedora Rawhide, with the updated profile (that is, 'bats test/system/505-networking-pasta.bats' from a Podman tree) and it looks like there are a couple of minor things missing, though.
Tests pass, but on a number of tests I'm getting these in the audit log:
type=AVC msg=audit(1747313163.407:129988): avc: denied { nlmsg_read } for pid=1313607 comm="ss" scontext=system_u:system_r:container_t:s0:c752,c999 tcontext=system_u:system_r:container_t:s0:c752,c999 tclass=netlink_tcpdiag_socket permissive=0
type=AVC msg=audit(1747313164.090:129989): avc: denied { getattr } for pid=1313686 comm="pasta.avx2" path="pipe:[6839919]" dev="pipefs" ino=6839919 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=fifo_file permissive=0
type=AVC msg=audit(1747313164.209:129990): avc: denied { getattr } for pid=1313714 comm="pasta.avx2" path="pipe:[6840012]" dev="pipefs" ino=6840012 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=fifo_file permissive=0
The 'ss' thing is unrelated, and might be something to add to container-selinux, perhaps. I'm not really sure if containers should reasonably be able to access netlink_tcpdiag_socket.
The getattr on pipes, though, is pasta trying to read out attributes of pipes that are used for loopback connections, that is, the path represented here (orange square on top) as "tap bypass":
https://passt.top/#pasta-pack-a-subtle-tap-abstraction
if those fail, by the way, things still work (I guess it's just what we do to probe / tune the size of the pipes).
A summary from audit2allow:
#============= container_t ==============
#!!!! This avc can be allowed using the boolean 'virt_sandbox_use_netlink'
allow container_t self:netlink_tcpdiag_socket nlmsg_read;
#============= pasta_t ==============
allow pasta_t container_runtime_t:fifo_file getattr;
I plan to try again later (probably in a few hours) to add what's missing (it could very well be just this rule) and get back to you. Of course, if you manage to fix / re-test meanwhile, before I get to it, feel free to re-post this.
-- Stefano
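For anyone who wants to follow the audit2allow summary above by hand, here is a minimal sketch of how a single AVC denial line maps to a candidate allow rule. The function name avc_to_allow is hypothetical, and this is not how audit2allow is implemented; it only covers the simple one-line denials quoted in this message.

```shell
# avc_to_allow: hypothetical helper, NOT part of audit2allow or pasta.
# Turns one AVC denial line (first argument) into a candidate "allow" rule.
avc_to_allow() {
    printf '%s\n' "$1" | awk '
    {
        perms = ""; src = ""; tgt = ""; cls = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "{") {
                # Collect the denied permissions between "{" and "}"
                for (j = i + 1; j <= NF && $j != "}"; j++)
                    perms = perms (perms == "" ? "" : " ") $j
            } else if ($i ~ /^scontext=/) {
                # Context is user:role:type:level, so the type is field 3
                split($i, a, ":"); src = a[3]
            } else if ($i ~ /^tcontext=/) {
                split($i, a, ":"); tgt = a[3]
            } else if ($i ~ /^tclass=/) {
                cls = substr($i, 8)   # strip the "tclass=" prefix
            }
        }
        if (perms == "" || src == "" || tgt == "" || cls == "") exit 1
        if (tgt == src) tgt = "self"
        if (perms ~ / /) perms = "{ " perms " }"
        printf "allow %s %s:%s %s;\n", src, tgt, cls, perms
    }'
}
```

Feeding it the second denial quoted above should give the same pasta_t rule that audit2allow reports. A rule produced this way is only a starting point; whether it belongs in the policy is exactly the judgment call discussed in this thread.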
On Thu, 15 May 2025 15:40:35 +0200
Stefano Brivio
On Wed, 14 May 2025 04:44:12 -0600 Max Chernoff wrote:
Currently, pasta runs in the container_runtime_exec_t context when running in a container. This is not ideal since it means that pasta runs with more privileges than strictly necessary. This commit updates the SELinux policy to have pasta transition to the pasta_t context when started from the container_runtime_t context, adds the appropriate labels to $XDG_RUNTIME_DIR/netns and $XDG_RUNTIME_DIR/containers/networks/rootless-netns, and grants the necessary permissions to the pasta_t context.
Link: https://bugs.passt.top/show_bug.cgi?id=81
Link: https://github.com/containers/podman/discussions/26100#discussioncomment-130...
Signed-off-by: Max Chernoff
Thanks, I think that with your patch we're almost there. (!)
I ran Podman tests covering pasta on Fedora Rawhide, with the updated profile (that is, 'bats test/system/505-networking-pasta.bats' from a Podman tree) and it looks like there are a couple of minor things missing, though.
Tests pass, but on a number of tests I'm getting these in the audit log:
type=AVC msg=audit(1747313163.407:129988): avc: denied { nlmsg_read } for pid=1313607 comm="ss" scontext=system_u:system_r:container_t:s0:c752,c999 tcontext=system_u:system_r:container_t:s0:c752,c999 tclass=netlink_tcpdiag_socket permissive=0
type=AVC msg=audit(1747313164.090:129989): avc: denied { getattr } for pid=1313686 comm="pasta.avx2" path="pipe:[6839919]" dev="pipefs" ino=6839919 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=fifo_file permissive=0
type=AVC msg=audit(1747313164.209:129990): avc: denied { getattr } for pid=1313714 comm="pasta.avx2" path="pipe:[6840012]" dev="pipefs" ino=6840012 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=fifo_file permissive=0
The 'ss' thing is unrelated, and might be something to add to container-selinux, perhaps. I'm not really sure if containers should reasonably be able to access netlink_tcpdiag_socket.
The getattr on pipes, though, is pasta trying to read out attributes of pipes that are used for loopback connections, that is, the path represented here (orange square on top) as "tap bypass":
https://passt.top/#pasta-pack-a-subtle-tap-abstraction
if those fail, by the way, things still work (I guess it's just what we do to probe / tune the size of the pipes).
A summary from audit2allow:
#============= container_t ==============
#!!!! This avc can be allowed using the boolean 'virt_sandbox_use_netlink'
allow container_t self:netlink_tcpdiag_socket nlmsg_read;
#============= pasta_t ==============
allow pasta_t container_runtime_t:fifo_file getattr;
I plan to try again later (probably in a few hours) to add what's missing (it could very well be just this rule) and get back to you. Of course, if you manage to fix / re-test meanwhile, before I get to it, feel free to re-post this.
Yes, adding getattr on fifo_file makes the tests pass without any SELinux warning. Full review of your patch:
diff --git a/contrib/selinux/pasta.fc b/contrib/selinux/pasta.fc
index 41ee46d..3be7789 100644
--- a/contrib/selinux/pasta.fc
+++ b/contrib/selinux/pasta.fc
@@ -8,7 +8,9 @@
 # Copyright (c) 2022 Red Hat GmbH
 # Author: Stefano Brivio
 
-/usr/bin/pasta system_u:object_r:pasta_exec_t:s0
-/usr/bin/pasta.avx2 system_u:object_r:pasta_exec_t:s0
-/tmp/pasta\.pcap system_u:object_r:pasta_log_t:s0
-/var/run/pasta\.pid system_u:object_r:pasta_pid_t:s0
+/usr/bin/pasta system_u:object_r:pasta_exec_t:s0
+/usr/bin/pasta.avx2 system_u:object_r:pasta_exec_t:s0
+/tmp/pasta\.pcap system_u:object_r:pasta_log_t:s0
+/var/run/pasta\.pid system_u:object_r:pasta_pid_t:s0
+/run/user/%{USERID}/netns system_u:object_r:ifconfig_var_run_t:s0
+/run/user/%{USERID}/containers/networks/rootless-netns system_u:object_r:ifconfig_var_run_t:s0
diff --git a/contrib/selinux/pasta.te b/contrib/selinux/pasta.te
index 89c8043..e97fd88 100644
--- a/contrib/selinux/pasta.te
+++ b/contrib/selinux/pasta.te
@@ -89,6 +89,13 @@ require {
 	class capability { sys_tty_config setuid setgid };
 	class cap_userns { setpcap sys_admin sys_ptrace net_bind_service net_admin };
 	class user_namespace create;
+
+	# Container requires
+	attribute_role usernetctl_roles;
+	role container_user_r;
+	role staff_r;
+	role user_r;
+	type container_runtime_t;
 }
 
 type pasta_t;
@@ -213,3 +220,32 @@ allow pasta_t netutils_t:process { noatsecure rlimitinh siginh };
 allow pasta_t ping_t:process { noatsecure rlimitinh siginh };
 allow pasta_t user_tty_device_t:chr_file { append read write };
 allow pasta_t user_devpts_t:chr_file { append read write };
+
+# Allow network administration commands for non-privileged users
+roleattribute container_user_r usernetctl_roles;
+roleattribute staff_r usernetctl_roles;
+roleattribute user_r usernetctl_roles;
+role usernetctl_roles types pasta_t;
+
+# Make pasta in a container run under the pasta_t context
+type_transition container_runtime_t pasta_exec_t : process pasta_t;
+allow container_runtime_t pasta_t:process transition;
+
+# Label the user network namespace files
+type_transition container_runtime_t user_tmp_t : dir ifconfig_var_run_t "netns";
+type_transition container_runtime_t user_tmp_t : dir ifconfig_var_run_t "rootless-netns";
+allow pasta_t ifconfig_var_run_t:dir { add_name open rmdir write };
+allow pasta_t ifconfig_var_run_t:file { create open write };
+
+# From audit2allow
Instead of these three "unsorted" rules:
+allow pasta_t container_runtime_t:fifo_file write;
...as I mentioned, changing this to:
allow pasta_t container_runtime_t:fifo_file { write getattr };
fixes the remaining warning. And I think it should be "grouped" together with the TCP socket stuff above, that is, just after:
corenet_tcp_bind_generic_node(pasta_t)
because it's something we need for (loopback) TCP connections, together with TCP sockets.
+allow pasta_t self:cap_userns { setgid setuid };
Strictly speaking, this part shouldn't be needed, see points 7. and c. at:
https://bugzilla.redhat.com/show_bug.cgi?id=2330512#c10
...unfortunately, I never got any feedback about those and I haven't found the time to fix this in kernel either, so, sure, let's keep this rule to avoid noise. We could group this together with capabilities stuff, that is, just after:
allow pasta_t self:cap_userns { setpcap sys_admin sys_ptrace net_admin net_bind_service };
(but separated, so that we can drop them without code churn) and maybe add a comment referencing:
https://bugzilla.redhat.com/show_bug.cgi?id=2330512#c10
and the fact that setuid() and setgid() are always called with the current UID and GID in the detached user namespace.
+allow pasta_t tmpfs_t:filesystem getattr;
This is needed regardless of Podman, getattr was simply missing from:
allow pasta_t tmpfs_t:filesystem mount;
so I would rather add it there, together with mount.
+
+# Allow pasta to bind to any port
+bool pasta_allow_bind_any_port true;
+if (pasta_allow_bind_any_port) {
+	allow pasta_t port_type:icmp_socket { accept getopt name_bind };
+	allow pasta_t port_type:tcp_socket { accept getopt name_bind name_connect };
+	allow pasta_t port_type:udp_socket { accept getopt name_bind };
+}
Everything else looks good to me! If you want to re-post this, you can give --subject-prefix="PATCH v2" to git format-patch.
-- Stefano
On Wed, 14 May 2025 04:44:11 -0600
Max Chernoff
Hi,
Currently, pasta runs in the container_runtime_exec_t context when running in a container. This commit updates the SELinux policy so that pasta instead runs in the pasta_t context.
I'm more familiar with CIL, so I initially developed the modified policy in CIL, and then later ported it to the kernel policy language. My original CIL source is available here:
https://github.com/gucci-on-fleek/maxchernoff.ca/blob/master/etc/selinux/loc...
I've tested this on Fedora 42 with rootless Podman, with both unconfined (unconfined_u) and confined (user_u) users, and with both TCP and UDP.
I've never actually used the email workflow for Git before, so please let me know if I've done something wrong.
Thanks a lot! Nothing wrong workflow-wise, I'll look at your patch in a bit.
I have to admit I hadn't thought of using 'type_transition' directly in pasta's policy, as opposed to having that in container-selinux, but it actually makes sense and it's nice to have everything managed here.
-- Stefano
Hi Stefano,
On Thu, 2025-05-15 at 17:55 +0200, Stefano Brivio wrote:
Instead of these three "unsorted" rules:
+allow pasta_t container_runtime_t:fifo_file write;
...as I mentioned, changing this to:
allow pasta_t container_runtime_t:fifo_file { write getattr };
fixes the remaining warning. And I think it should be "grouped" together with the TCP socket stuff above, that is, just after:
corenet_tcp_bind_generic_node(pasta_t)
because it's something we need for (loopback) TCP connections, together with TCP sockets.
Done.
+allow pasta_t self:cap_userns { setgid setuid };
Strictly speaking, this part shouldn't be needed, see points 7. and c. at:
https://bugzilla.redhat.com/show_bug.cgi?id=2330512#c10
...unfortunately, I never got any feedback about those and I haven't found the time to fix this in kernel either, so, sure, let's keep this rule to avoid noise. We could group this together with capabilities stuff, that is, just after:
allow pasta_t self:cap_userns { setpcap sys_admin sys_ptrace net_admin net_bind_service };
(but separated, so that we can drop them without code churn) and maybe add a comment referencing:
https://bugzilla.redhat.com/show_bug.cgi?id=2330512#c10
and the fact that setuid() and setgid() are always called with the current UID and GID in the detached user namespace.
If the denial is harmless (as mentioned in the bug), why not make it "dontaudit"? I've tested it out and it seems to work fine for me.
+allow pasta_t tmpfs_t:filesystem getattr;
This is needed regardless of Podman, getattr was simply missing from:
allow pasta_t tmpfs_t:filesystem mount;
so I would rather add it there, together with mount.
Done.
+# Allow pasta to bind to any port
+bool pasta_allow_bind_any_port true;
+if (pasta_allow_bind_any_port) {
+	allow pasta_t port_type:icmp_socket { accept getopt name_bind };
+	allow pasta_t port_type:tcp_socket { accept getopt name_bind name_connect };
+	allow pasta_t port_type:udp_socket { accept getopt name_bind };
+}
I renamed this to "pasta_bind_all_ports" since that better matches the preexisting booleans "git_session_bind_all_unreserved_ports", "mozilla_plugin_bind_unreserved_ports", and "tor_bind_all_unreserved_ports".
-/usr/bin/pasta system_u:object_r:pasta_exec_t:s0
-/usr/bin/pasta.avx2 system_u:object_r:pasta_exec_t:s0
-/tmp/pasta\.pcap system_u:object_r:pasta_log_t:s0
-/var/run/pasta\.pid system_u:object_r:pasta_pid_t:s0
+/usr/bin/pasta system_u:object_r:pasta_exec_t:s0
+/usr/bin/pasta.avx2 system_u:object_r:pasta_exec_t:s0
+/tmp/pasta\.pcap system_u:object_r:pasta_log_t:s0
+/var/run/pasta\.pid system_u:object_r:pasta_pid_t:s0
+/run/user/%{USERID}/netns system_u:object_r:ifconfig_var_run_t:s0
+/run/user/%{USERID}/containers/networks/rootless-netns system_u:object_r:ifconfig_var_run_t:s0
I also corrected the whitespace here to use tabs (instead of the awful tab-space mix that I accidentally used).
Also, when this commit is eventually packaged, you'll need to run restorecon on /run/; otherwise you won't be able to start any containers until you log out and back in. I think that %selinux_relabel_post should handle this, but I'm not sure if it excludes /run/ or not.
Thanks,
-- Max
Max Chernoff (1):
  selinux: Transition to pasta_t in containers
 contrib/selinux/pasta.fc | 10 ++++++----
 contrib/selinux/pasta.te | 37 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 42 insertions(+), 5 deletions(-)
--
2.49.0
On Thu, 15 May 2025 23:11:02 -0600
Max Chernoff
Hi Stefano,
On Thu, 2025-05-15 at 17:55 +0200, Stefano Brivio wrote:
Instead of these three "unsorted" rules:
+allow pasta_t container_runtime_t:fifo_file write;
...as I mentioned, changing this to:
allow pasta_t container_runtime_t:fifo_file { write getattr };
fixes the remaining warning. And I think it should be "grouped" together with the TCP socket stuff above, that is, just after:
corenet_tcp_bind_generic_node(pasta_t)
because it's something we need for (loopback) TCP connections, together with TCP sockets.
Done.
+allow pasta_t self:cap_userns { setgid setuid };
Strictly speaking, this part shouldn't be needed, see points 7. and c. at:
https://bugzilla.redhat.com/show_bug.cgi?id=2330512#c10
...unfortunately, I never got any feedback about those and I haven't found the time to fix this in kernel either, so, sure, let's keep this rule to avoid noise. We could group this together with capabilities stuff, that is, just after:
allow pasta_t self:cap_userns { setpcap sys_admin sys_ptrace net_admin net_bind_service };
(but separated, so that we can drop them without code churn) and maybe add a comment referencing:
https://bugzilla.redhat.com/show_bug.cgi?id=2330512#c10
and the fact that setuid() and setgid() are always called with the current UID and GID in the detached user namespace.
If the denial is harmless (as mentioned in the bug), why not make it "dontaudit"? I've tested it out and it seems to work fine for me.
Because it reminds me I should send a kernel fix every time I see it ;) but that's not a good reason to scare users, so I think your approach is valid.
+allow pasta_t tmpfs_t:filesystem getattr;
This is needed regardless of Podman, getattr was simply missing from:
allow pasta_t tmpfs_t:filesystem mount;
so I would rather add it there, together with mount.
Done.
+# Allow pasta to bind to any port
+bool pasta_allow_bind_any_port true;
+if (pasta_allow_bind_any_port) {
+	allow pasta_t port_type:icmp_socket { accept getopt name_bind };
+	allow pasta_t port_type:tcp_socket { accept getopt name_bind name_connect };
+	allow pasta_t port_type:udp_socket { accept getopt name_bind };
+}
I renamed this to "pasta_bind_all_ports" since that better matches the preexisting booleans "git_session_bind_all_unreserved_ports", "mozilla_plugin_bind_unreserved_ports", and "tor_bind_all_unreserved_ports".
Ah, right, thanks for checking.
-/usr/bin/pasta system_u:object_r:pasta_exec_t:s0
-/usr/bin/pasta.avx2 system_u:object_r:pasta_exec_t:s0
-/tmp/pasta\.pcap system_u:object_r:pasta_log_t:s0
-/var/run/pasta\.pid system_u:object_r:pasta_pid_t:s0
+/usr/bin/pasta system_u:object_r:pasta_exec_t:s0
+/usr/bin/pasta.avx2 system_u:object_r:pasta_exec_t:s0
+/tmp/pasta\.pcap system_u:object_r:pasta_log_t:s0
+/var/run/pasta\.pid system_u:object_r:pasta_pid_t:s0
+/run/user/%{USERID}/netns system_u:object_r:ifconfig_var_run_t:s0
+/run/user/%{USERID}/containers/networks/rootless-netns system_u:object_r:ifconfig_var_run_t:s0
I also corrected the whitespace here to use tabs (instead of the awful tab-space mix that I accidentally used).
Also, when this commit is eventually packaged, you'll need to run restorecon on /run/; otherwise you won't be able to start any containers until you log out and back in. I think that %selinux_relabel_post should handle this, but I'm not sure if it excludes /run/ or not.
Oops, thanks for mentioning that. I indeed ran restorecon -R /run manually to test your change, and I thought %selinux_relabel_post would indeed take care of it on upgrades.
But it looks like it doesn't. I checked with /var/run/pasta.pid and the label doesn't get fixed. fixfiles(8) has a:
find /var/run \( -context "*:${UNLABELED}*" -o -context "*:${UNDEFINED}*" \) -exec chcon --no-dereference --reference /var/run {} \;
which shouldn't however affect this. I couldn't quite find out where the issue is.
Worst case, I'll add an explicit restorecon(8) call in the spec file (feel free to propose a change for that too, of course...).
-- Stefano
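As a rough illustration of which existing paths a relabel would need to touch, the sketch below checks a path against one of the new pasta.fc patterns. The helper name fc_pattern_matches is hypothetical, and it assumes that %{USERID} stands for a numeric UID (expanded here to the regex [0-9]+); the real expansion and matching are done by the SELinux userspace tools, not by this function.

```shell
# fc_pattern_matches: hypothetical helper, for illustration only.
# $1 = file-context pattern from pasta.fc, $2 = path to check.
# ASSUMPTION: %{USERID} is treated as a numeric UID, i.e. [0-9]+.
fc_pattern_matches() {
    pattern=$(printf '%s\n' "$1" | sed 's/%{USERID}/[0-9]+/g')
    # File-context patterns apply to the whole path, so anchor both ends
    printf '%s\n' "$2" | grep -Eq "^${pattern}\$"
}

# Example: the rootless-netns directory of UID 959 from this thread
if fc_pattern_matches '/run/user/%{USERID}/containers/networks/rootless-netns' \
                      '/run/user/959/containers/networks/rootless-netns'; then
    echo "path is covered by the new pasta.fc entry"
fi
```

On a real system the authoritative answer comes from the loaded policy rather than from a regex check like this one, so treat the sketch as a reading aid for the two new file-context entries only.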
Currently, pasta runs in the container_runtime_exec_t context when
running in a container. This is not ideal since it means that pasta runs
with more privileges than strictly necessary. This commit updates the
SELinux policy to have pasta transition to the pasta_t context when
started from the container_runtime_t context, adds the appropriate
labels to $XDG_RUNTIME_DIR/netns and
$XDG_RUNTIME_DIR/containers/networks/rootless-netns, and grants the
necessary permissions to the pasta_t context.
Link: https://bugs.passt.top/show_bug.cgi?id=81
Link: https://github.com/containers/podman/discussions/26100#discussioncomment-130...
Signed-off-by: Max Chernoff
Hi, podman maintainer here.
On 16/05/2025 07:11, Max Chernoff wrote:
Currently, pasta runs in the container_runtime_exec_t context when running in a container. This is not ideal since it means that pasta runs with more privileges than strictly necessary. This commit updates the SELinux policy to have pasta transition to the pasta_t context when started from the container_runtime_t context, adds the appropriate labels to $XDG_RUNTIME_DIR/netns and $XDG_RUNTIME_DIR/containers/networks/rootless-netns, and grants the necessary permissions to the pasta_t context.
Link: https://bugs.passt.top/show_bug.cgi?id=81
Link: https://github.com/containers/podman/discussions/26100#discussioncomment-130...
Signed-off-by: Max Chernoff
--- contrib/selinux/pasta.fc | 10 ++++++---- contrib/selinux/pasta.te | 37 ++++++++++++++++++++++++++++++++++++- 2 files changed, 42 insertions(+), 5 deletions(-)
So I did test this patch with podman's system and e2e tests on podman v5.5.0 on fedora rawhide and I noticed one problem that caused some failures:
podman build is broken with this policy. And I assume that means buildah would not work as well. The difference is that in the build case we do not pass a bind mounted namespace path under /run but rather /proc/$pid/ns/net as path to pasta. We get this error:
pasta failed with exit code 1: Couldn't open network namespace /proc/360143/ns/net: Permission denied
Logged avc: denied { search } for pid=360144 comm="pasta.avx2" name="360143" dev="proc" ino=2030208 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=0
The good news is that this is the only problem I found.
diff --git a/contrib/selinux/pasta.fc b/contrib/selinux/pasta.fc
index 41ee46d..e4aefc4 100644
--- a/contrib/selinux/pasta.fc
+++ b/contrib/selinux/pasta.fc
@@ -8,7 +8,9 @@
 # Copyright (c) 2022 Red Hat GmbH
 # Author: Stefano Brivio
 
-/usr/bin/pasta system_u:object_r:pasta_exec_t:s0
-/usr/bin/pasta.avx2 system_u:object_r:pasta_exec_t:s0
-/tmp/pasta\.pcap system_u:object_r:pasta_log_t:s0
-/var/run/pasta\.pid system_u:object_r:pasta_pid_t:s0
+/usr/bin/pasta system_u:object_r:pasta_exec_t:s0
+/usr/bin/pasta.avx2 system_u:object_r:pasta_exec_t:s0
+/tmp/pasta\.pcap system_u:object_r:pasta_log_t:s0
+/var/run/pasta\.pid system_u:object_r:pasta_pid_t:s0
+/run/user/%{USERID}/netns system_u:object_r:ifconfig_var_run_t:s0
+/run/user/%{USERID}/containers/networks/rootless-netns system_u:object_r:ifconfig_var_run_t:s0
diff --git a/contrib/selinux/pasta.te b/contrib/selinux/pasta.te
index 89c8043..7bcb451 100644
--- a/contrib/selinux/pasta.te
+++ b/contrib/selinux/pasta.te
@@ -89,6 +89,13 @@ require {
 	class capability { sys_tty_config setuid setgid };
 	class cap_userns { setpcap sys_admin sys_ptrace net_bind_service net_admin };
 	class user_namespace create;
+
+	# Container requires
+	attribute_role usernetctl_roles;
+	role container_user_r;
+	role staff_r;
+	role user_r;
+	type container_runtime_t;
 }
 
 type pasta_t;
@@ -113,6 +120,9 @@ init_daemon_domain(pasta_t, pasta_exec_t)
 
 allow pasta_t self:capability { setpcap net_bind_service sys_tty_config dac_read_search net_admin sys_resource setuid setgid };
 allow pasta_t self:cap_userns { setpcap sys_admin sys_ptrace net_admin net_bind_service };
+# pasta only calls setuid and setgid with the current UID and GID, so this
+# denial is harmless. See https://bugzilla.redhat.com/show_bug.cgi?id=2330512#c10
+dontaudit pasta_t self:cap_userns { setgid setuid };
 allow pasta_t self:user_namespace create;
 
 auth_read_passwd(pasta_t)
@@ -130,7 +140,7 @@ allow pasta_t user_home_t:file { open read getattr setattr execute execute_no_tr
 allow pasta_t user_home_dir_t:dir { search getattr open add_name read write };
 allow pasta_t user_home_dir_t:file { create open read write };
 allow pasta_t tmp_t:dir { add_name mounton remove_name write };
-allow pasta_t tmpfs_t:filesystem mount;
+allow pasta_t tmpfs_t:filesystem { getattr mount };
 allow pasta_t fs_t:filesystem unmount;
 allow pasta_t root_t:dir mounton;
 manage_files_pattern(pasta_t, pasta_pid_t, pasta_pid_t)
@@ -156,6 +166,7 @@ allow pasta_t tmp_t:sock_file { create unlink write };
 allow pasta_t self:tcp_socket create_stream_socket_perms;
 corenet_tcp_sendrecv_generic_node(pasta_t)
 corenet_tcp_bind_generic_node(pasta_t)
+allow pasta_t container_runtime_t:fifo_file { getattr write };
 allow pasta_t pasta_port_t:tcp_socket { name_bind name_connect };
 allow pasta_t pasta_port_t:udp_socket { name_bind };
 allow pasta_t http_port_t:tcp_socket { name_bind name_connect };
@@ -213,3 +224,27 @@ allow pasta_t netutils_t:process { noatsecure rlimitinh siginh };
 allow pasta_t ping_t:process { noatsecure rlimitinh siginh };
 allow pasta_t user_tty_device_t:chr_file { append read write };
 allow pasta_t user_devpts_t:chr_file { append read write };
+
+# Allow network administration commands for non-privileged users
+roleattribute container_user_r usernetctl_roles;
+roleattribute staff_r usernetctl_roles;
+roleattribute user_r usernetctl_roles;
+role usernetctl_roles types pasta_t;
+
+# Make pasta in a container run under the pasta_t context
+type_transition container_runtime_t pasta_exec_t : process pasta_t;
+allow container_runtime_t pasta_t:process transition;
+
+# Label the user network namespace files
+type_transition container_runtime_t user_tmp_t : dir ifconfig_var_run_t "netns";
+type_transition container_runtime_t user_tmp_t : dir ifconfig_var_run_t "rootless-netns";
+allow pasta_t ifconfig_var_run_t:dir { add_name open rmdir write };
+allow pasta_t ifconfig_var_run_t:file { create open write };
+
+# Allow pasta to bind to any port
+bool pasta_bind_all_ports true;
+if (pasta_bind_all_ports) {
I am not familiar with the selinux stuff but if this is a boolean that users can configure should this be documented in the man page here?
+	allow pasta_t port_type:icmp_socket { accept getopt name_bind };
+	allow pasta_t port_type:tcp_socket { accept getopt name_bind name_connect };
+	allow pasta_t port_type:udp_socket { accept getopt name_bind };
+}
-- Paul Holzinger
Hi Paul,
On Fri, 2025-05-16 at 13:59 +0200, Paul Holzinger wrote:
So I did test this patch with podman's system and e2e test on podman v5.5.0 on fedora rawhide and I noticed one problem that caused some failures:
podman build is broken with this policy. And I assume that means buildah would not work as well. The difference is that in the build case we do not pass a bind mounted namespace path under /run but rather /proc/$pid/ns/net as path to pasta. We get this error:
pasta failed with exit code 1: Couldn't open network namespace /proc/360143/ns/net: Permission denied
Logged avc: denied { search } for pid=360144 comm="pasta.avx2" name="360143" dev="proc" ino=2030208 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=0
Odd, it works for me:
$ id -Z
user_u:user_r:user_t:s0-s0:c0.c1023
$ podman --version
podman version 5.4.2
$ pasta --version
pasta 0^20250512.g8ec1341-1.fc42.x86_64
Copyright Red Hat
GNU General Public License, version 2 or later
https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
$ cat Containerfile
FROM registry.fedoraproject.org/fedora-minimal:42
RUN dnf install --assumeyes python3
$ podman build --no-cache --network=pasta .
STEP 1/2: FROM registry.fedoraproject.org/fedora-minimal:42
STEP 2/2: RUN dnf install --assumeyes python3
Updating and loading repositories:
Fedora 42 - x86_64 - Updates 100% | 8.3 MiB/s | 6.8 MiB | 00m01s
Fedora 42 openh264 (From Cisco) - x86_ 100% | 7.7 KiB/s | 6.0 KiB | 00m01s
Fedora 42 - x86_64 100% | 12.3 MiB/s | 35.4 MiB | 00m03s
Repositories loaded.
Package Arch Version Repository Size
Installing:
python3 x86_64 3.13.3-2.fc42 updates 28.7 KiB
Installing dependencies:
expat x86_64 2.7.1-1.fc42 fedora 290.2 KiB
libb2 x86_64 0.98.1-13.fc42 fedora 46.1 KiB
libgomp x86_64 15.1.1-1.fc42 updates 538.5 KiB
mpdecimal x86_64 4.0.1-1.fc42 updates 217.2 KiB
python-pip-wheel noarch 24.3.1-2.fc42 fedora 1.2 MiB
python3-libs x86_64 3.13.3-2.fc42 updates 39.9 MiB
readline x86_64 8.2-13.fc42 fedora 485.0 KiB
tzdata noarch 2025b-1.fc42 fedora 1.6 MiB
Installing weak dependencies:
python-unversioned-command noarch 3.13.3-2.fc42 updates 23.0 B
Transaction Summary:
Installing: 10 packages
Total size of inbound packages is 12 MiB. Need to download 12 MiB.
After this operation, 44 MiB extra will be used (install 44 MiB, remove 0 B).
[ 1/10] python3-0:3.13.3-2.fc42.x86_64 100% | 109.6 KiB/s | 29.7 KiB | 00m00s
[...]
[12/12] Installing python-unversioned-c 100% | 9.6 KiB/s | 424.0 B | 00m00s
Complete!
COMMIT
--> edfb5d3fee4c
edfb5d3fee4c729c0ec373150bd382e5a8461bc6ce18b14bcc12606d65ee185f
$ ps auxZ | grep pasta # In another terminal while the above is running
user_u:user_r:container_runtime_t:s0-s0:c0.c1023 test-us+ 497555 0.4 0.1 2533448 48028 pts/2 Sl+ 06:11 0:00 podman build --no-cache --network=pasta .
user_u:user_r:pasta_t:s0-s0:c0.c1023 test-us+ 497680 1.1 0.0 206444 17188 ? Ss 06:11 0:00 /usr/sbin/pasta --config-net --dns-forward 169.254.1.1 -t none -u none -T none -U none --no-map-gw --quiet --netns /proc/497672/ns/net --map-guest-addr 169.254.1.2
What are the SELinux contexts of the network namespaces? This is what I get:
$ ls -laZ $XDG_RUNTIME_DIR/netns $XDG_RUNTIME_DIR/containers/networks/rootless-netns /proc/self/ns/net
ls: cannot access '/run/user/959/netns': No such file or directory
lrwxrwxrwx. 1 test-user test-user user_u:user_r:user_t:s0-s0:c0.c1023 0 May 16 06:15 /proc/self/ns/net -> 'net:[4026531840]'
/run/user/959/containers/networks/rootless-netns:
total 0
drwx------. 2 test-user test-user user_u:object_r:ifconfig_var_run_t:s0 40 May 16 06:05 ./
drwx------. 3 test-user test-user user_u:object_r:user_tmp_t:s0 60 May 16 06:05 ../
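When comparing results across machines, it can help to look only at the type field of each label, since that is what the domain transition changes. The sketch below is a minimal illustration; context_type is a hypothetical helper name, and it assumes the usual user:role:type:level layout seen in the output above.

```shell
# context_type: hypothetical helper, prints the type (third field) of an
# SELinux context written as user:role:type:level
context_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Example with the label of the pasta process from the ps output above
label='user_u:user_r:pasta_t:s0-s0:c0.c1023'
if [ "$(context_type "$label")" = "pasta_t" ]; then
    echo "pasta transitioned to pasta_t"
else
    echo "pasta is still running as $(context_type "$label")"
fi
```

With the label shown for PID 497680 this reports that the transition happened; a process that stayed in container_runtime_t would take the other branch.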
I am not familiar with the selinux stuff but if this is a boolean that users can configure should this be documented in the man page here?
I guess more documentation is always a good thing, but most of the other container-related SELinux booleans seem to be undocumented:
$ sudo semanage boolean --list | grep ^container_
container_connect_any (off , off) Determine whether container can connect to all TCP ports.
container_manage_cgroup (on , on) Allow sandbox containers to manage cgroup (systemd)
container_read_certs (off , off) Allow all container domains to read cert files and directories
container_use_cephfs (off , off) Determine whether container can use ceph file system
container_use_devices (off , off) Allow containers to use any device volume mounted into container
container_use_dri_devices (on , on) Allow containers to use any dri device volume mounted into container
container_use_ecryptfs (off , off) Determine whether container can use ecrypt file system
container_use_xserver_devices (off , off) Allow containers to use any xserver device volume mounted into container, mostly used for GPU acceleration
container_user_exec_content (on , on) Allow container to user exec content
$ man -wK container_connect_any
No manual entry for container_connect_any
$ man -wK container_manage_cgroup
/usr/share/man/man1/podman-create.1.gz
/usr/share/man/man1/podman-run.1.gz
/usr/share/man/man7/podman-troubleshooting.7.gz
$ man -wK container_read_certs
No manual entry for container_read_certs
$ man -wK container_use_cephfs
No manual entry for container_use_cephfs
$ man -wK container_use_devices
/usr/share/man/man1/sesearch.1.gz
/usr/share/man/man1/podman-pod-clone.1.gz
/usr/share/man/man1/podman-pod-create.1.gz
/usr/share/man/man1/podman-build.1.gz
/usr/share/man/man1/podman-farm-build.1.gz
/usr/share/man/man1/podman-create.1.gz
/usr/share/man/man1/podman-run.1.gz
/usr/share/man/man8/setsebool.8.gz
$ man -wK container_user_exec_content
No manual entry for container_user_exec_content
I'll send a patch for the man pages tomorrow.
Thanks,
-- Max
On 16/05/2025 14:22, Max Chernoff wrote:
Hi Paul,
On Fri, 2025-05-16 at 13:59 +0200, Paul Holzinger wrote:
So I did test this patch with podman's system and e2e test on podman v5.5.0 on fedora rawhide and I noticed one problem that caused some failures:
podman build is broken with this policy. And I assume that means buildah would not work as well. The difference is that in the build case we do not pass a bind mounted namespace path under /run but rather /proc/$pid/ns/net as path to pasta. We get this error:
pasta failed with exit code 1: Couldn't open network namespace /proc/360143/ns/net: Permission denied
Logged avc: denied { search } for pid=360144 comm="pasta.avx2" name="360143" dev="proc" ino=2030208 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=0
Odd, it works for me:
$ id -Z
user_u:user_r:user_t:s0-s0:c0.c1023

$ podman --version
podman version 5.4.2

$ pasta --version
pasta 0^20250512.g8ec1341-1.fc42.x86_64
Copyright Red Hat
GNU General Public License, version 2 or later
https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

$ cat Containerfile
FROM registry.fedoraproject.org/fedora-minimal:42
RUN dnf install --assumeyes python3
$ podman build --no-cache --network=pasta .
STEP 1/2: FROM registry.fedoraproject.org/fedora-minimal:42
STEP 2/2: RUN dnf install --assumeyes python3
Updating and loading repositories:
 Fedora 42 - x86_64 - Updates           100% |   8.3 MiB/s |   6.8 MiB |  00m01s
 Fedora 42 openh264 (From Cisco) - x86_ 100% |   7.7 KiB/s |   6.0 KiB |  00m01s
 Fedora 42 - x86_64                     100% |  12.3 MiB/s |  35.4 MiB |  00m03s
Repositories loaded.
Package                      Arch    Version         Repository       Size
Installing:
 python3                     x86_64  3.13.3-2.fc42   updates      28.7 KiB
Installing dependencies:
 expat                       x86_64  2.7.1-1.fc42    fedora      290.2 KiB
 libb2                       x86_64  0.98.1-13.fc42  fedora       46.1 KiB
 libgomp                     x86_64  15.1.1-1.fc42   updates     538.5 KiB
 mpdecimal                   x86_64  4.0.1-1.fc42    updates     217.2 KiB
 python-pip-wheel            noarch  24.3.1-2.fc42   fedora        1.2 MiB
 python3-libs                x86_64  3.13.3-2.fc42   updates      39.9 MiB
 readline                    x86_64  8.2-13.fc42     fedora      485.0 KiB
 tzdata                      noarch  2025b-1.fc42    fedora        1.6 MiB
Installing weak dependencies:
 python-unversioned-command  noarch  3.13.3-2.fc42   updates      23.0   B

Transaction Summary:
 Installing: 10 packages

Total size of inbound packages is 12 MiB. Need to download 12 MiB.
After this operation, 44 MiB extra will be used (install 44 MiB, remove 0 B).
[ 1/10] python3-0:3.13.3-2.fc42.x86_64  100% | 109.6 KiB/s |  29.7 KiB |  00m00s
[...]
[12/12] Installing python-unversioned-c 100% |   9.6 KiB/s | 424.0   B |  00m00s
Complete!
COMMIT
--> edfb5d3fee4c
edfb5d3fee4c729c0ec373150bd382e5a8461bc6ce18b14bcc12606d65ee185f
$ ps auxZ | grep pasta # In another terminal while the above is running
user_u:user_r:container_runtime_t:s0-s0:c0.c1023 test-us+ 497555 0.4 0.1 2533448 48028 pts/2 Sl+ 06:11 0:00 podman build --no-cache --network=pasta .
user_u:user_r:pasta_t:s0-s0:c0.c1023 test-us+ 497680 1.1 0.0 206444 17188 ? Ss 06:11 0:00 /usr/sbin/pasta --config-net --dns-forward 169.254.1.1 -t none -u none -T none -U none --no-map-gw --quiet --netns /proc/497672/ns/net --map-guest-addr 169.254.1.2
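The domain each process runs in is the third colon-separated field of the context that ps prints. This is a minimal sketch for pulling that field out when comparing setups. The helper name selinux_type is hypothetical and not part of any SELinux tool.

```shell
# Print the type (third colon-separated field) of an SELinux context string.
# selinux_type is a hypothetical helper name used only for illustration.
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Contexts copied from the ps output above.
selinux_type 'user_u:user_r:container_runtime_t:s0-s0:c0.c1023'  # container_runtime_t
selinux_type 'user_u:user_r:pasta_t:s0-s0:c0.c1023'              # pasta_t
```

The level part (s0-s0:c0.c1023) contains a colon itself, which is why the sketch selects field 3 rather than stripping from the last colon.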
What are the SELinux contexts of the network namespaces? This is what I get:
$ ls -laZ $XDG_RUNTIME_DIR/netns $XDG_RUNTIME_DIR/containers/networks/rootless-netns /proc/self/ns/net
ls: cannot access '/run/user/959/netns': No such file or directory
lrwxrwxrwx. 1 test-user test-user user_u:user_r:user_t:s0-s0:c0.c1023 0 May 16 06:15 /proc/self/ns/net -> 'net:[4026531840]'

It seems to be unconfined for me:

lrwxrwxrwx. 1 test test unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 0 May 16 08:32 /proc/self/ns/net -> 'net:[4026531840]'

/run/user/959/containers/networks/rootless-netns:
total 0
drwx------. 2 test-user test-user user_u:object_r:ifconfig_var_run_t:s0 40 May 16 06:05 ./
drwx------. 3 test-user test-user user_u:object_r:user_tmp_t:s0 60 May 16 06:05 ../

/run/user/1001/containers/networks/rootless-netns:
total 0
drwx------. 2 test test unconfined_u:object_r:ifconfig_var_run_t:s0 40 May 16 06:26 .
drwx------. 4 test test unconfined_u:object_r:user_tmp_t:s0 120 May 16 06:26 ..

/run/user/1001/netns:
total 0
drwxr-xr-x. 2 test test unconfined_u:object_r:ifconfig_var_run_t:s0 40 May 16 07:31 .
drwx------. 9 test test unconfined_u:object_r:user_tmp_t:s0 200 May 16 06:19 ..
I am not familiar with the selinux stuff but if this is a boolean that users can configure should this be documented in the man page here?

I guess more documentation is always a good thing, but most of the other container-related SELinux booleans seem to be undocumented:

$ sudo semanage boolean --list | grep ^container_
container_connect_any          (off , off)  Determine whether container can connect to all TCP ports.
container_manage_cgroup        (on  , on)   Allow sandbox containers to manage cgroup (systemd)
container_read_certs           (off , off)  Allow all container domains to read cert files and directories
container_use_cephfs           (off , off)  Determine whether container can use ceph file system
container_use_devices          (off , off)  Allow containers to use any device volume mounted into container
container_use_dri_devices      (on  , on)   Allow containers to use any dri device volume mounted into container
container_use_ecryptfs         (off , off)  Determine whether container can use ecrypt file system
container_use_xserver_devices  (off , off)  Allow containers to use any xserver device volume mounted into container, mostly used for GPU acceleration
container_user_exec_content    (on  , on)   Allow container to user exec content

$ man -wK container_connect_any
No manual entry for container_connect_any

$ man -wK container_manage_cgroup
/usr/share/man/man1/podman-create.1.gz
/usr/share/man/man1/podman-run.1.gz
/usr/share/man/man7/podman-troubleshooting.7.gz

$ man -wK container_read_certs
No manual entry for container_read_certs

$ man -wK container_use_cephfs
No manual entry for container_use_cephfs

$ man -wK container_use_devices
/usr/share/man/man1/sesearch.1.gz
/usr/share/man/man1/podman-pod-clone.1.gz
/usr/share/man/man1/podman-pod-create.1.gz
/usr/share/man/man1/podman-build.1.gz
/usr/share/man/man1/podman-farm-build.1.gz
/usr/share/man/man1/podman-create.1.gz
/usr/share/man/man1/podman-run.1.gz
/usr/share/man/man8/setsebool.8.gz

$ man -wK container_user_exec_content
No manual entry for container_user_exec_content
I'll send a patch for the man pages tomorrow.
Thanks,
-- Max
-- Paul Holzinger
On Fri, 16 May 2025 14:35:14 +0200
Paul Holzinger
I'm getting the same issue with 'podman build' and the Containerfile shared by Max. Running with SELinux in permissive mode, I'm getting:

# cat /var/log/audit/audit.log
type=AVC msg=audit(1747410763.621:130615): avc: denied { search } for pid=1352409 comm="pasta.avx2" name="1352408" dev="proc" ino=7022238 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1747410763.621:130616): avc: denied { read } for pid=1352409 comm="pasta.avx2" name="net" dev="proc" ino=7022285 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=lnk_file permissive=1
type=AVC msg=audit(1747410763.622:130617): avc: denied { read } for pid=1352409 comm="pasta.avx2" scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=file permissive=1
type=AVC msg=audit(1747410763.622:130618): avc: denied { read } for pid=1352409 comm="pasta.avx2" name="ns" dev="proc" ino=7022284 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1747410763.622:130619): avc: denied { open } for pid=1352409 comm="pasta.avx2" path="/proc/1352408/ns" dev="proc" ino=7022284 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=1
type=AVC msg=audit(1747410764.622:130620): avc: denied { read } for pid=1352417 comm="pasta.avx2" name="net" dev="proc" ino=7022285 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=system_u:system_r:container_t:s0:c609,c838 tclass=lnk_file permissive=1

and:

# audit2allow -a
#============= pasta_t ==============
allow pasta_t container_runtime_t:dir { open read search };
allow pasta_t container_runtime_t:file read;
allow pasta_t container_runtime_t:lnk_file read;
allow pasta_t container_t:lnk_file read;

If I add those rules, everything works (well, I'm not saying that's the solution...).

This is a Fedora virtual machine with:

# uname -a
Linux passt.top 6.11.0-0.rc3.30.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Aug 12 14:18:21 UTC 2024 x86_64 GNU/Linux
# rpm -qe podman passt
podman-5.5.0~rc2-1.fc43.x86_64
passt-0^20250512.g8ec1341-1.fc43.x86_64

To me those denials look reasonable, in the sense that I would expect the namespace links to have container_runtime_t type.

By the way:

$ ls -laZ $XDG_RUNTIME_DIR/netns $XDG_RUNTIME_DIR/containers/networks/rootless-netns /proc/self/ns/net
lrwxrwxrwx. 1 sbrivio sbrivio unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 0 May 16 15:59 /proc/self/ns/net -> 'net:[4026531840]'

/run/user/1001/containers/networks/rootless-netns:
total 0
drwx------. 2 sbrivio sbrivio unconfined_u:object_r:ifconfig_var_run_t:s0 40 May 15 15:00 .
drwx------. 4 sbrivio sbrivio unconfined_u:object_r:user_tmp_t:s0 120 May 15 15:00 ..

/run/user/1001/netns:
total 0
drwxr-xr-x. 2 sbrivio sbrivio unconfined_u:object_r:ifconfig_var_run_t:s0 40 May 15 15:00 .
drwx------. 8 sbrivio sbrivio unconfined_u:object_r:user_tmp_t:s0 220 May 6 08:02 ..

Max, could it be that you're running stuff with some customised SELinux policy?

By the way, with "unconfined disabled":

https://bugzilla.redhat.com/show_bug.cgi?id=2330512

we seem to have unconfined_t as type for those links:

type=AVC msg=audit(1733378482.320:31258): avc: denied { open } for pid=651955 comm="pasta.avx2" path="/proc/651954/ns" dev="proc" ino=2904841 scontext=staff_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=dir permissive=1

...but I'm not sure at which point in time exactly.
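To compare denials across machines, it can help to reduce each AVC record to its source type, target type, class and permissions. The following is only a rough sketch of that reduction using awk. It is not a replacement for audit2allow, and the function name avc_summary is hypothetical.

```shell
# Reduce AVC denial records on stdin to "source -> target : class { perms }".
# avc_summary is a hypothetical helper; it only looks at the fields shown in
# the log lines quoted in this thread.
avc_summary() {
    awk '
    /avc: +denied/ {
        perms = ""; src = ""; tgt = ""; cls = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "{") {
                # Collect permission names up to the closing brace.
                for (j = i + 1; j <= NF && $j != "}"; j++)
                    perms = (perms == "" ? $j : perms " " $j)
            } else if ($i ~ /^scontext=/) {
                split($i, a, ":"); src = a[3]   # type is the third field
            } else if ($i ~ /^tcontext=/) {
                split($i, a, ":"); tgt = a[3]
            } else if ($i ~ /^tclass=/) {
                cls = substr($i, 8)             # text after "tclass="
            }
        }
        print src " -> " tgt " : " cls " { " perms " }"
    }' | sort -u
}

# Example with the first record from the log above.
avc_summary <<'EOF'
type=AVC msg=audit(1747410763.621:130615): avc: denied { search } for pid=1352409 comm="pasta.avx2" name="1352408" dev="proc" ino=7022238 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=dir permissive=1
EOF
# pasta_t -> container_runtime_t : dir { search }
```

Unlike audit2allow, this sketch does not merge permissions for the same source, target and class into one rule. It only removes exact duplicates.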
Wait a moment. I don't think something SELinux-specific belongs in pasta's man page, because that's not relevant for all users and distributions. We could maintain that as an addition for Fedora and perhaps Gentoo, but I wonder if it's really worth the effort.

Besides, I think that:

# semanage boolean --list | grep pasta
pasta_allow_bind_any_port (on , on) Allow pasta to allow bind any port

...this is the common practice to document those knobs (and where I usually look for things). We wouldn't have much to add to this anyway.

-- Stefano
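For scripted checks, the two state columns in that listing can be extracted with sed. This is a minimal sketch that assumes the "name (state , default) description" layout shown in this thread. The helper name bool_state is hypothetical.

```shell
# Read `semanage boolean --list` style text on stdin and print the two
# parenthesised columns for the boolean named in $1.
# bool_state is a hypothetical helper; the input layout is assumed to match
# the listings quoted in this thread.
bool_state() {
    sed -n "s/^$1 *(\([a-z]*\) *, *\([a-z]*\)).*/\1 \2/p"
}

# Example with the line quoted above.
printf '%s\n' 'pasta_allow_bind_any_port (on , on) Allow pasta to allow bind any port' |
    bool_state pasta_allow_bind_any_port
# on on
```

On a live system the input would come from `semanage boolean --list`, which typically needs root.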
participants (3)
- Max Chernoff
- Paul Holzinger
- Stefano Brivio