[Cc'ing Paul as Podman maintainer, thread at:
https://archives.passt.top/passt-user/671252c8-88f6-45b7-b719-b82786e84bb7@g...]
On Sun, 6 Jul 2025 19:08:46 +0200
Lisa Gnedt via user wrote:
Hi,
On 2025-07-06 17:15, Lisa Gnedt wrote:
It might be easier to get it right when directly controlling all the syscalls involved, without having to mix and match multiple tools. Since Linux 4.9 it seems to be possible to get the user namespace owning a network namespace with the NS_GET_USERNS ioctl [3].
I just wrote a hacky patch as a proof of concept of this idea. It works fine for me in both test cases. However, in its current form it breaks the --userns parameter. But it should not be too hard to address this issue.
I am not sure what kernel version compatibility you are targeting, since the ioctl is only available since Linux 4.9. Would it be an option for you to make it the default behavior when a PID is specified?
From my perspective this should be the expected behavior and should not break any previously working use case.
I finally had a second look, a bit quicker than I wanted but I think I grasped the issue.

For context, "this" is: always join the user namespace owning a network namespace. It looks reasonable (and desirable) to me, but I'm not sure how / why it breaks the --userns parameter.

We should probably never do this when --netns-only is given (that's Podman's case, for example).

It would be good to have a way to "cleanly" exclude this new behaviour, but, once we add the NS_GET_USERNS trick, --netns-only doesn't exactly get us back to the previous behaviour. What about --userns-from-pid or something like that? That name isn't great though.

Now, 4.9 feels "old" enough, but pasta used to run on a 3.13 kernel a while ago, then a few things were (inadvertently) broken. But it "almost" does.

Couldn't we just add a fallback for the case where NS_GET_USERNS fails? You're already handling the error. You could just print a warning and continue instead of calling die_perror()...
diff --git a/conf.c b/conf.c
index 36845e2..cd67e7a 100644
--- a/conf.c
+++ b/conf.c
@@ -642,7 +642,7 @@ static void conf_pasta_ns(int *netns_only, char *userns, char *netns,
 
 			if (!*userns) {
 				if (snprintf_check(userns, PATH_MAX,
-						   "/proc/%ld/ns/user", pidval))
+						   "/proc/%ld/ns/net", pidval))
 					die_perror("Can't build userns path");
 			}
 		}
diff --git a/isolation.c b/isolation.c
index bbcd23b..cbfe0f0 100644
--- a/isolation.c
+++ b/isolation.c
@@ -81,6 +81,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "util.h"
@@ -254,6 +255,14 @@ void isolate_user(uid_t uid, gid_t gid, bool use_userns, const char *userns,
 		if (ufd < 0)
 			die_perror("Couldn't open user namespace %s", userns);
 
+		int real_ufd;
+		real_ufd = ioctl(ufd, NS_GET_USERNS);
+		if (real_ufd < 0)
+			die_perror("Couldn't get user namespace from network namespace %s", userns);
+
+		close(ufd);
+		ufd = real_ufd;
+
 		if (setns(ufd, CLONE_NEWUSER) != 0)
 			die_perror("Couldn't enter user namespace %s", userns);
 
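
To illustrate the fallback idea, here is a rough, untested sketch written as a standalone helper. The name userns_fd_for_pid() is made up for this example and is not existing pasta code, and it uses plain fprintf() where pasta would use its own logging. It also assumes that, if NS_GET_USERNS fails, we want the previous behaviour, that is, opening /proc/<pid>/ns/user directly: as far as I can tell, just continuing with the descriptor for ns/net wouldn't work, because setns() with CLONE_NEWUSER expects a user namespace descriptor.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

#include <linux/nsfs.h>		/* NS_GET_USERNS, kernel headers >= 4.9 */

#ifndef NS_GET_USERNS		/* Same value as in linux/nsfs.h */
#define NS_GET_USERNS	_IO(0xb7, 0x1)
#endif

/**
 * userns_fd_for_pid() - Hypothetical helper: find user namespace to join
 * @pid:	PID of a process in the target network namespace
 *
 * Return: open descriptor for a user namespace, -1 on failure
 */
int userns_fd_for_pid(long pid)
{
	char path[PATH_MAX];
	int nfd, ufd, err;

	snprintf(path, sizeof(path), "/proc/%ld/ns/net", pid);
	nfd = open(path, O_RDONLY | O_CLOEXEC);
	if (nfd >= 0) {
		/* Ask the kernel for the user namespace owning this netns */
		ufd = ioctl(nfd, NS_GET_USERNS);
		err = errno;	/* close() might clobber errno */
		close(nfd);

		if (ufd >= 0)
			return ufd;

		/* Old kernel, or namespace out of our scope: warn, go on */
		fprintf(stderr,
			"warning: NS_GET_USERNS failed (%s), using /proc/%ld/ns/user\n",
			strerror(err), pid);
	}

	/* Fallback, previous behaviour: user namespace of the process */
	snprintf(path, sizeof(path), "/proc/%ld/ns/user", pid);
	return open(path, O_RDONLY | O_CLOEXEC);
}
```

The caller would then pass the returned descriptor to setns() with CLONE_NEWUSER, as isolate_user() does today, and would still fail there if neither path gives a usable namespace.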
-- Stefano