In https://github.com/containers/podman/issues/19428, pasta is failing to open the namespace tap device. Paul Holzinger correctly noted that pasta isn't very helpful in this case, with no information beyond "it failed". He suggested a patch for that, however it wasn't quite sufficient: errno may not be propagated back from the ephemeral thread which enters the namespace, and even if it does the errno alone won't tell us which of the possible failure points actually failed. 2/3 here is a more robust change to address the problem. The other patches are minor cleanups I noticed along the way. Link: https://bugs.passt.top/show_bug.cgi?id=69 David Gibson (3): util: Make ns_enter() a void function and report setns() errors tap: More detailed error reporting in tap_ns_tun() tap: Remove unnecessary global tun_ns_fd conf.c | 3 ++- tap.c | 33 ++++++++++++++++++--------------- udp.c | 6 ++---- util.c | 8 +++----- util.h | 2 +- 5 files changed, 26 insertions(+), 26 deletions(-) -- 2.41.0
ns_enter() returns an integer... but it's always zero. If we actually fail the function doesn't return. Therefore it makes more sense for this to be a function returning void, and we can remove the cases where we pointlessly checked its return value. In addition ns_enter() is usually called from an ephemeral thread created by NS_CALL(). That means that the exit(EXIT_FAILURE) there usually won't be reported (since NS_CALL() doesn't wait() for the thread). So, use die() instead to print out some information in the unlikely event that our setns() here does fail. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- conf.c | 3 ++- tap.c | 4 ++-- udp.c | 6 ++---- util.c | 8 +++----- util.h | 2 +- 5 files changed, 10 insertions(+), 13 deletions(-) diff --git a/conf.c b/conf.c index 78eaf2d..a0622d2 100644 --- a/conf.c +++ b/conf.c @@ -101,9 +101,10 @@ static int get_bound_ports_ns(void *arg) struct get_bound_ports_ns_arg *a = (struct get_bound_ports_ns_arg *)arg; struct ctx *c = a->c; - if (!c->pasta_netns_fd || ns_enter(c)) + if (!c->pasta_netns_fd) return 0; + ns_enter(c); get_bound_ports(c, 1, a->proto); return 0; diff --git a/tap.c b/tap.c index a6a73d3..0f90cab 100644 --- a/tap.c +++ b/tap.c @@ -1182,9 +1182,9 @@ static int tap_ns_tun(void *arg) struct ctx *c = (struct ctx *)arg; memcpy(ifr.ifr_name, c->pasta_ifn, IFNAMSIZ); + ns_enter(c); - if (ns_enter(c) || - (tun_ns_fd = open("/dev/net/tun", flags)) < 0 || + if ((tun_ns_fd = open("/dev/net/tun", flags)) < 0 || ioctl(tun_ns_fd, TUNSETIFF, &ifr) || !(c->pasta_ifi = if_nametoindex(c->pasta_ifn))) { if (tun_ns_fd != -1) diff --git a/udp.c b/udp.c index 39c59d4..7be73f5 100644 --- a/udp.c +++ b/udp.c @@ -473,8 +473,7 @@ static int udp_splice_new_ns(void *arg) a = (struct udp_splice_new_ns_arg *)arg; - if (ns_enter(a->c)) - return 0; + ns_enter(a->c); a->s = udp_splice_new(a->c, a->v6, a->src, true); @@ -1068,8 +1067,7 @@ int udp_sock_init_ns(void *arg) struct ctx *c = (struct ctx *)arg; unsigned dst; - if (ns_enter(c)) - return 0; + ns_enter(c); for (dst = 0; dst < NUM_PORTS; dst++) { if (!bitmap_isset(c->udp.fwd_out.f.map, dst)) diff --git a/util.c b/util.c index 1d00404..2f9c27d 100644 --- a/util.c +++ b/util.c @@ -378,16 +378,14 @@ void procfs_scan_listen(struct ctx *c, uint8_t proto, int ip_version, int ns, * ns_enter() - Enter configured user (unless already joined) and network ns * @c: Execution context * - * Return: 0, won't return on failure + * Won't return on failure * * #syscalls:pasta setns */ -int ns_enter(const struct ctx *c) +void ns_enter(const struct ctx *c) { if (setns(c->pasta_netns_fd, CLONE_NEWNET)) - exit(EXIT_FAILURE); - - return 0; + die("setns() failed entering netns: %s", strerror(errno)); } /** diff --git a/util.h b/util.h index 26892aa..23dcad5 100644 --- a/util.h +++ b/util.h @@ -216,7 +216,7 @@ int bitmap_isset(const uint8_t *map, int bit); char *line_read(char *buf, size_t len, int fd); void procfs_scan_listen(struct ctx *c, uint8_t proto, int ip_version, int ns, uint8_t *map, uint8_t *exclude); -int ns_enter(const struct ctx *c); +void ns_enter(const struct ctx *c); bool ns_is_init(void); void write_pidfile(int fd, pid_t pid); int __daemon(int pidfile_fd, int devnull_fd); -- 2.41.0
There are several possible failure points in tap_ns_tun(), but if anything goes wrong, we just set tun_ns_fd to -1 resulting in the same error message. Add more detailed error reporting to the various failure points. At the same time, we know this is only called from tap_sock_tun_init() which will terminate pasta if we fail, so we can simplify things a little because we don't need to close() the fd on the failure paths. Link: https://bugs.passt.top/show_bug.cgi?id=69 Link: https://github.com/containers/podman/issues/19428 Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- tap.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/tap.c b/tap.c index 0f90cab..a5d357a 100644 --- a/tap.c +++ b/tap.c @@ -1171,7 +1171,7 @@ static int tun_ns_fd = -1; * tap_ns_tun() - Get tuntap fd in namespace * @c: Execution context * - * Return: 0 + * Return: 0 on success, exits on failure * * #syscalls:pasta ioctl openat */ @@ -1180,17 +1180,24 @@ static int tap_ns_tun(void *arg) struct ifreq ifr = { .ifr_flags = IFF_TAP | IFF_NO_PI }; int flags = O_RDWR | O_NONBLOCK | O_CLOEXEC; struct ctx *c = (struct ctx *)arg; + int fd, rc; + tun_ns_fd = -1; memcpy(ifr.ifr_name, c->pasta_ifn, IFNAMSIZ); ns_enter(c); - if ((tun_ns_fd = open("/dev/net/tun", flags)) < 0 || - ioctl(tun_ns_fd, TUNSETIFF, &ifr) || - !(c->pasta_ifi = if_nametoindex(c->pasta_ifn))) { - if (tun_ns_fd != -1) - close(tun_ns_fd); - tun_ns_fd = -1; - } + fd = open("/dev/net/tun", flags); + if (fd < 0) + die("Failed to open() /dev/net/tun: %s", strerror(errno)); + + rc = ioctl(fd, TUNSETIFF, &ifr); + if (rc < 0) + die("TUNSETIFF failed: %s", strerror(errno)); + + if (!(c->pasta_ifi = if_nametoindex(c->pasta_ifn))) + die("Tap device opened but no network interface found"); + + tun_ns_fd = fd; return 0; } @@ -1205,7 +1212,7 @@ static void tap_sock_tun_init(struct ctx *c) NS_CALL(tap_ns_tun, c); if (tun_ns_fd == -1) - die("Failed to open tun socket in namespace"); + die("Failed to set up tap device in namespace"); pasta_ns_conf(c); -- 2.41.0
tap_ns_tun(), which runs in an ephemeral thread puts the fd it opens into the global variable tun_ns_fd to communicate it back to the main thread in tap_sock_tun_init(). However, the only thing tap_sock_tun_init() does with it is copies it to c->fd_tap and everything else uses it from there. tap_ns_tun() already has access to the context structure, so we might as well store the value directly in there rather than having a global as an intermediate. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- tap.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/tap.c b/tap.c index a5d357a..e034f94 100644 --- a/tap.c +++ b/tap.c @@ -1165,8 +1165,6 @@ static void tap_sock_unix_new(struct ctx *c) epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev); } -static int tun_ns_fd = -1; - /** * tap_ns_tun() - Get tuntap fd in namespace * @c: Execution context @@ -1182,7 +1180,7 @@ static int tap_ns_tun(void *arg) struct ctx *c = (struct ctx *)arg; int fd, rc; - tun_ns_fd = -1; + c->fd_tap = -1; memcpy(ifr.ifr_name, c->pasta_ifn, IFNAMSIZ); ns_enter(c); @@ -1197,7 +1195,7 @@ static int tap_ns_tun(void *arg) if (!(c->pasta_ifi = if_nametoindex(c->pasta_ifn))) die("Tap device opened but no network interface found"); - tun_ns_fd = fd; + c->fd_tap = fd; return 0; } @@ -1211,13 +1209,11 @@ static void tap_sock_tun_init(struct ctx *c) struct epoll_event ev = { 0 }; NS_CALL(tap_ns_tun, c); - if (tun_ns_fd == -1) + if (c->fd_tap == -1) die("Failed to set up tap device in namespace"); pasta_ns_conf(c); - c->fd_tap = tun_ns_fd; - ev.data.fd = c->fd_tap; ev.events = EPOLLIN | EPOLLRDHUP; epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev); -- 2.41.0
On Wed, 2 Aug 2023 13:15:39 +1000 David Gibson <david(a)gibson.dropbear.id.au> wrote:In https://github.com/containers/podman/issues/19428, pasta is failing to open the namespace tap device. Paul Holzinger correctly noted that pasta isn't very helpful in this case, with no information beyond "it failed". He suggested a patch for that, however it wasn't quite sufficient: errno may not be propagated back from the ephemeral thread which enters the namespace, and even if it does the errno alone won't tell us which of the possible failure points actually failed. 2/3 here is a more robust change to address the problem. The other patches are minor cleanups I noticed along the way. Link: https://bugs.passt.top/show_bug.cgi?id=69 David Gibson (3): util: Make ns_enter() a void function and report setns() errors tap: More detailed error reporting in tap_ns_tun() tap: Remove unnecessary global tun_ns_fdApplied, too (no, I didn't forget about the flow tracking series :) I'm still reviewing it). -- Stefano
On Fri, Aug 04, 2023 at 09:04:46AM +0200, Stefano Brivio wrote:On Wed, 2 Aug 2023 13:15:39 +1000 David Gibson <david(a)gibson.dropbear.id.au> wrote:Fair enough. It's probably the most conceptually complex of all the outstanding series. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibsonIn https://github.com/containers/podman/issues/19428, pasta is failing to open the namespace tap device. Paul Holzinger correctly noted that pasta isn't very helpful in this case, with no information beyond "it failed". He suggested a patch for that, however it wasn't quite sufficient: errno may not be propagated back from the ephemeral thread which enters the namespace, and even if it does the errno alone won't tell us which of the possible failure points actually failed. 2/3 here is a more robust change to address the problem. The other patches are minor cleanups I noticed along the way. Link: https://bugs.passt.top/show_bug.cgi?id=69 David Gibson (3): util: Make ns_enter() a void function and report setns() errors tap: More detailed error reporting in tap_ns_tun() tap: Remove unnecessary global tun_ns_fdApplied, too (no, I didn't forget about the flow tracking series :) I'm still reviewing it).