On Mon, Feb 13, 2023 at 02:13:19AM +0100, Stefano Brivio wrote:Adapted from a patch by Paul Holzinger: when pasta spawns a command, operating without a pre-existing user and network namespace, it needs to wait for the tap device to be configured and its handler ready, before the command is actually executed. Otherwise, something like: pasta --config-net nslookup passt.top usually fails as the nslookup command is issued before the network interface is ready. We can't adopt a simpler approach based on SIGSTOP and SIGCONT here: the child runs in a separate PID namespace, so it can't send SIGSTOP to itself as the kernel sees the child as init process and blocks the delivery of the signal. We could send SIGSTOP from the parent, but this wouldn't avoid the possible condition where the child isn't ready to wait for it when the parent sends it, also raised by Paul -- and SIGSTOP can't be blocked, so it can never be pending. Use SIGUSR1 instead: mask it before clone(), so that the child starts with it blocked, and can safely wait for it. Once the parent is ready, it sends SIGUSR1 to the child. If SIGUSR1 is sent before the child is waiting for it, the kernel will queue it for us, because it's blocked. Reported-by: Paul Holzinger <pholzing(a)redhat.com> Fixes: 1392bc5ca002 ("Allow pasta to take a command to execute") Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>--- I just applied this, to keep things simpler for myself and for tests, given that Paul already had a look. I'm posting it anyway for reviews. passt.c | 3 +++ pasta.c | 14 +++++++++++++- pasta.h | 2 ++ 3 files changed, 18 insertions(+), 1 deletion(-) diff --git a/passt.c b/passt.c index 8b2c50d..d957e14 100644 --- a/passt.c +++ b/passt.c @@ -301,6 +301,9 @@ int main(int argc, char **argv) else write_pidfile(pidfile_fd, getpid()); + if (pasta_child_pid) + kill(pasta_child_pid, SIGUSR1); + isolate_postfork(&c); timer_init(&c, &now); diff --git a/pasta.c b/pasta.c index 528f02a..9169913 100644 --- a/pasta.c +++ b/pasta.c @@ -47,7 +47,7 @@ #include "log.h" /* PID of child, in case we created a namespace */ -static int pasta_child_pid; +int pasta_child_pid; /** * pasta_child_handler() - Exit once shell exits (if we started it), reap clones @@ -166,10 +166,16 @@ struct pasta_spawn_cmd_arg { static int pasta_spawn_cmd(void *arg) { const struct pasta_spawn_cmd_arg *a; + sigset_t set; if (write_file("/proc/sys/net/ipv4/ping_group_range", "0 0")) warn("Cannot set ping_group_range, ICMP requests might fail"); + /* Wait for the parent to be ready: see main() */ + sigemptyset(&set); + sigaddset(&set, SIGUSR1); + sigwaitinfo(&set, NULL); + a = (const struct pasta_spawn_cmd_arg *)arg; execvp(a->exe, a->argv); @@ -196,6 +202,7 @@ void pasta_start_ns(struct ctx *c, uid_t uid, gid_t gid, char ns_fn_stack[NS_FN_STACK_SIZE]; char *sh_argv[] = { NULL, NULL }; char sh_arg0[PATH_MAX + 1]; + sigset_t set; c->foreground = 1; if (!c->debug) @@ -226,6 +233,11 @@ void pasta_start_ns(struct ctx *c, uid_t uid, gid_t gid, arg.argv = sh_argv; } + /* Block SIGUSR1 in child, we queue it in main() when we're ready */ + sigemptyset(&set); + sigaddset(&set, SIGUSR1); + sigprocmask(SIG_BLOCK, &set, NULL); + pasta_child_pid = do_clone(pasta_spawn_cmd, ns_fn_stack, sizeof(ns_fn_stack), CLONE_NEWIPC | CLONE_NEWPID | CLONE_NEWNET | diff --git a/pasta.h b/pasta.h index a8b9893..0ccb7e9 100644 --- a/pasta.h +++ b/pasta.h @@ -6,6 +6,8 @@ #ifndef PASTA_H #define PASTA_H +extern int pasta_child_pid; + void pasta_open_ns(struct ctx *c, const char *netns); void pasta_start_ns(struct ctx *c, uid_t uid, gid_t gid, int argc, char *argv[]);-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson