On Thu, 15 May 2025 15:40:35 +0200
Stefano Brivio
On Wed, 14 May 2025 04:44:12 -0600 Max Chernoff
wrote: Currently, pasta runs in the container_runtime_exec_t context when running in a container. This is not ideal since it means that pasta runs with more privileges than strictly necessary. This commit updates the SELinux policy to have pasta transition to the pasta_t context when started from the container_runtime_t context, adds the appropriate labels to $XDG_RUNTIME_DIR/netns and $XDG_RUNTIME_DIR/containers/networks/rootless-netns, and grants the necessary permissions to the pasta_t context.
Link: https://bugs.passt.top/show_bug.cgi?id=81 Link: https://github.com/containers/podman/discussions/26100#discussioncomment-130... Signed-off-by: Max Chernoff
Thanks, I think that with your patch we're almost there. (!)
I ran Podman tests covering pasta on Fedora Rawhide, with the updated profile (that is, 'bats test/system/505-networking-pasta.bats' from a Podman tree) and it looks like there are a couple of minor things missing, though.
Tests pass, but on a number of tests I'm getting these in the audit log:
type=AVC msg=audit(1747313163.407:129988): avc: denied { nlmsg_read } for pid=1313607 comm="ss" scontext=system_u:system_r:container_t:s0:c752,c999 tcontext=system_u:system_r:container_t:s0:c752,c999 tclass=netlink_tcpdiag_socket permissive=0 type=AVC msg=audit(1747313164.090:129989): avc: denied { getattr } for pid=1313686 comm="pasta.avx2" path="pipe:[6839919]" dev="pipefs" ino=6839919 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=fifo_file permissive=0 type=AVC msg=audit(1747313164.209:129990): avc: denied { getattr } for pid=1313714 comm="pasta.avx2" path="pipe:[6840012]" dev="pipefs" ino=6840012 scontext=unconfined_u:unconfined_r:pasta_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:container_runtime_t:s0-s0:c0.c1023 tclass=fifo_file permissive=0
The 'ss' thing is unrelated, and might be something to add to container-selinux, perhaps. I'm not really sure if containers should reasonably be able to access netlink_tcpdiag_socket.
The getattr on pipes, though, is pasta trying to read out attributes of pipes that are used for loopback connections, that is, the path represented here (orange square on top) as "tap bypass":
https://passt.top/#pasta-pack-a-subtle-tap-abstraction
if those fail, by the way, things still work (I guess it's just what we do to probe / tune the size of the pipes).
A summary from audit2allow:
#============= container_t ==============
#!!!! This avc can be allowed using the boolean 'virt_sandbox_use_netlink' allow container_t self:netlink_tcpdiag_socket nlmsg_read;
#============= pasta_t ============== allow pasta_t container_runtime_t:fifo_file getattr;
I plan to try again later (probably in a few hours) to add what's missing (it could very well be just this rule) and get back to you. Of course, if you manage to fix / re-test meanwhile, before I get to it, feel free to re-post this.
Yes, adding getattr on fifo_file makes the tests pass without any SELinux warning. Full review of your patch:
diff --git a/contrib/selinux/pasta.fc b/contrib/selinux/pasta.fc index 41ee46d..3be7789 100644 --- a/contrib/selinux/pasta.fc +++ b/contrib/selinux/pasta.fc @@ -8,7 +8,9 @@ # Copyright (c) 2022 Red Hat GmbH # Author: Stefano Brivio
-/usr/bin/pasta system_u:object_r:pasta_exec_t:s0 -/usr/bin/pasta.avx2 system_u:object_r:pasta_exec_t:s0 -/tmp/pasta\.pcap system_u:object_r:pasta_log_t:s0 -/var/run/pasta\.pid system_u:object_r:pasta_pid_t:s0 +/usr/bin/pasta system_u:object_r:pasta_exec_t:s0 +/usr/bin/pasta.avx2 system_u:object_r:pasta_exec_t:s0 +/tmp/pasta\.pcap system_u:object_r:pasta_log_t:s0 +/var/run/pasta\.pid system_u:object_r:pasta_pid_t:s0 +/run/user/%{USERID}/netns system_u:object_r:ifconfig_var_run_t:s0 +/run/user/%{USERID}/containers/networks/rootless-netns system_u:object_r:ifconfig_var_run_t:s0 diff --git a/contrib/selinux/pasta.te b/contrib/selinux/pasta.te index 89c8043..e97fd88 100644 --- a/contrib/selinux/pasta.te +++ b/contrib/selinux/pasta.te @@ -89,6 +89,13 @@ require { class capability { sys_tty_config setuid setgid }; class cap_userns { setpcap sys_admin sys_ptrace net_bind_service net_admin }; class user_namespace create; + + # Container requires + attribute_role usernetctl_roles; + role container_user_r; + role staff_r; + role user_r; + type container_runtime_t; }
type pasta_t; @@ -213,3 +220,32 @@ allow pasta_t netutils_t:process { noatsecure rlimitinh siginh }; allow pasta_t ping_t:process { noatsecure rlimitinh siginh }; allow pasta_t user_tty_device_t:chr_file { append read write }; allow pasta_t user_devpts_t:chr_file { append read write }; + +# Allow network administration commands for non-privileged users +roleattribute container_user_r usernetctl_roles; +roleattribute staff_r usernetctl_roles; +roleattribute user_r usernetctl_roles; +role usernetctl_roles types pasta_t; + +# Make pasta in a container run under the pasta_t context +type_transition container_runtime_t pasta_exec_t : process pasta_t; +allow container_runtime_t pasta_t:process transition; + +# Label the user network namespace files +type_transition container_runtime_t user_tmp_t : dir ifconfig_var_run_t "netns"; +type_transition container_runtime_t user_tmp_t : dir ifconfig_var_run_t "rootless-netns"; +allow pasta_t ifconfig_var_run_t:dir { add_name open rmdir write }; +allow pasta_t ifconfig_var_run_t:file { create open write }; + +# From audit2allow
Instead of these three "unsorted" rules:
+allow pasta_t container_runtime_t:fifo_file write;
...as I mentioned, changing this to: allow pasta_t container_runtime_t:fifo_file { write getattr }; fixes the remaining warning. And I think it should be "grouped" together with the TCP socket stuff above, that is, just after: corenet_tcp_bind_generic_node(pasta_t) because it's something we need for (loopback) TCP connections, together with TCP sockets.
+allow pasta_t self:cap_userns { setgid setuid };
Strictly speaking, this part shouldn't be needed, see points 7. and c. at: https://bugzilla.redhat.com/show_bug.cgi?id=2330512#c10 ...unfortunately, I never got any feedback about those and I haven't found the time to fix this in kernel either, so, sure, let's keep this rule to avoid noise. We could group this together with capabilities stuff, that is, just after: allow pasta_t self:cap_userns { setpcap sys_admin sys_ptrace net_admin net_bind_service }; (but separated, so that we can drop them without code churn) and maybe add a comment referencing: https://bugzilla.redhat.com/show_bug.cgi?id=2330512#c10 and the fact that setuid() and setgid() are always called with the current UID and GID in the detached user namespace.
+allow pasta_t tmpfs_t:filesystem getattr;
This is needed regardless of Podman, getattr was simply missing from: allow pasta_t tmpfs_t:filesystem mount; so I would rather add it there, together with mount.
+ +# Allow pasta to bind to any port +bool pasta_allow_bind_any_port true; +if (pasta_allow_bind_any_port) { + allow pasta_t port_type:icmp_socket { accept getopt name_bind }; + allow pasta_t port_type:tcp_socket { accept getopt name_bind name_connect }; + allow pasta_t port_type:udp_socket { accept getopt name_bind }; +}
Everything else looks good to me! If you want to re-post this, you can give --subject-prefix="PATCH v2" to git format-email. -- Stefano