We watch network namespace entries to detect when we should quit
(unless --no-netns-quit is passed), and these might stored in a tmpfs
typically mounted at /run/user/UID or /var/run/user/UID, or found in
procfs at /proc/PID/ns/.
Currently, we try to use inotify for any possible location of those
entries, but inotify, of course, doesn't work on pseudo-filesystems
(see inotify(7)).
The man page reflects this: the description of --no-netns-quit
implies that we won't quit anyway if the namespace is not "bound to
the filesystem".
Well, we won't quit, but, since commit 9e0dbc894813 ("More
deterministic detection of whether argument is a PID, PATH or NAME"),
we try. And, indeed, this is harmless, as the caveat from that
commit message states.
Now, it turns out that Buildah, a tool to create container images,
sharing its codebase with Podman, passes a procfs entry to pasta, and
expects pasta to exit once the network namespace is not needed
anymore, that is, once the original container process, also spawned
by Buildah, terminates.
Get this to work by using the timer fallback mechanism if the
namespace name is passed as a path belonging to a pseudo-filesystem.
This is expected to be procfs, but I covered sysfs and devpts
pseudo-filesystems as well, because nothing actually prevents
creating this kind of directory structure and links there.
Note that fstatfs(), according to some versions of man pages, was
apparently "deprecated" by the LSB. My reasoning for using it is
essentially this:
https://lore.kernel.org/linux-man/f54kudgblgk643u32tb6at4cd3kkzha6hslahv24s…
...that is, there was no such thing as an LSB deprecation, and
anyway there's no other way to get the filesystem type.
Also note that, while it might sound more obvious to detect the
filesystem type using fstatfs() on the file descriptor itself
(c->pasta_netns_fd), the reported filesystem type for it is nsfs, no
matter what path was given to pasta. If we use the parent directory,
we'll typically have either tmpfs or procfs reported.
If the target namespace is given as a PID, or as a PID-based procfs
entry, we don't risk races if this PID is recycled: our handle on
/proc/PID/ns will always refer to the original namespace associated
with that PID, and we don't re-open this entry from procfs to check
it.
There's, however, a remaining race possibility if the parent process
is not the one associated to the network namespace we operate on: in
that case, the parent might pass a procfs entry associated to a PID
that was recycled by the time we parse it. This can't happen if the
namespace PID matches the one of the parent, because we detach from
the controlling terminal after parsing the namespace reference.
To avoid this type of race, if desired, we could add the option for
the parent to pass a PID file descriptor, that the parent obtained
via pidfd_open(). This is beyond the scope of this change.
Update the man page to reflect that, even if the target network
namespace is passed as a procfs path or a PID, we'll now quit when
the procfs entry is gone.
Reported-by: Paul Holzinger <pholzing(a)redhat.com>
Link: https://github.com/containers/podman/pull/21563#issuecomment-1948200214
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
I'm pushing this in a bit as Podman's CI needs it soon, but posting
anyway to ease later reviews.
v4: Fix some further paragraphs in the commit message. Make man page
change hopefully easier to follow.
v3: Given that we now open c->netns_dir before checking the
filesystem type, we could as well pass this file descriptor to
fstatfs() to do the check, instead of statfs() on the path.
Fix a couple of paragraphs in the commit message.
v2: Coverity Scan isn't happy if we "check" (kind of) c->netns_dir
with statfs() before opening it in a non-atomic way. Just to make
things clear, false positive or not: open it, check it, close it
if it wasn't needed: we don't rely on the check.
passt.1 | 8 ++++++--
pasta.c | 24 +++++++++++++++++++-----
2 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/passt.1 b/passt.1
index dc2d719..e97ef73 100644
--- a/passt.1
+++ b/passt.1
@@ -550,8 +550,12 @@ without \-\-userns.
.TP
.BR \-\-no-netns-quit
-If the target network namespace is bound to the filesystem (that is, if PATH or
-NAME are given as target), do not exit once the network namespace is deleted.
+Do not exit once the target namespace reference is removed.
+
+Without this option, \fBpasta\fR will terminate if the target network namespace
+is bound to the filesystem, and the given path is deleted, or if the target
+network namespace is represented by a procfs entry, and that entry is deleted,
+representing the fact that a process with the given PID terminated.
.TP
.BR \-\-config-net
diff --git a/pasta.c b/pasta.c
index 01d1511..61feaa9 100644
--- a/pasta.c
+++ b/pasta.c
@@ -33,6 +33,7 @@
#include <sys/timerfd.h>
#include <sys/types.h>
#include <sys/stat.h>
+#include <sys/statfs.h>
#include <fcntl.h>
#include <sys/wait.h>
#include <signal.h>
@@ -41,6 +42,7 @@
#include <netinet/in.h>
#include <net/ethernet.h>
#include <sys/syscall.h>
+#include <linux/magic.h>
#include "util.h"
#include "passt.h"
@@ -390,12 +392,21 @@ void pasta_netns_quit_init(const struct ctx *c)
union epoll_ref ref = { .type = EPOLL_TYPE_NSQUIT_INOTIFY };
struct epoll_event ev = { .events = EPOLLIN };
int flags = O_NONBLOCK | O_CLOEXEC;
- int fd;
+ struct statfs s = { 0 };
+ bool try_inotify = true;
+ int fd = -1, dir_fd;
if (c->mode != MODE_PASTA || c->no_netns_quit || !*c->netns_base)
return;
- if ((fd = inotify_init1(flags)) < 0)
+ if ((dir_fd = open(c->netns_dir, O_CLOEXEC | O_RDONLY)) < 0)
+ die("netns dir open: %s, exiting", strerror(errno));
+
+ if (fstatfs(dir_fd, &s) || s.f_type == DEVPTS_SUPER_MAGIC ||
+ s.f_type == PROC_SUPER_MAGIC || s.f_type == SYSFS_MAGIC)
+ try_inotify = false;
+
+ if (try_inotify && (fd = inotify_init1(flags)) < 0)
warn("inotify_init1(): %s, use a timer", strerror(errno));
if (fd >= 0 && inotify_add_watch(fd, c->netns_dir, IN_DELETE) < 0) {
@@ -409,11 +420,11 @@ void pasta_netns_quit_init(const struct ctx *c)
if ((fd = pasta_netns_quit_timer()) < 0)
die("Failed to set up fallback netns timer, exiting");
- ref.nsdir_fd = open(c->netns_dir, O_CLOEXEC | O_RDONLY);
- if (ref.nsdir_fd < 0)
- die("netns dir open: %s, exiting", strerror(errno));
+ ref.nsdir_fd = dir_fd;
ref.type = EPOLL_TYPE_NSQUIT_TIMER;
+ } else {
+ close(dir_fd);
}
if (fd > FD_REF_MAX)
@@ -463,6 +474,9 @@ void pasta_netns_quit_timer_handler(struct ctx *c, union epoll_ref ref)
fd = openat(ref.nsdir_fd, c->netns_base, O_PATH | O_CLOEXEC);
if (fd < 0) {
+ if (errno == EACCES) /* Expected for existing procfs entry */
+ return;
+
info("Namespace %s is gone, exiting", c->netns_base);
exit(EXIT_SUCCESS);
}
--
2.39.2
We watch network namespace entries to detect when we should quit
(unless --no-netns-quit is passed), and these might stored in a tmpfs
typically mounted at /run/user/UID or /var/run/user/UID, or found in
procfs at /proc/PID/ns/.
Currently, we try to use inotify for any possible location of those
entries, but inotify, of course, doesn't work on pseudo-filesystems
(see inotify(7)).
The man page reflects this: the description of --no-netns-quit
implies that we won't quit anyway if the namespace is not "bound to
the filesystem".
Well, we won't quit, but, since commit 9e0dbc894813 ("More
deterministic detection of whether argument is a PID, PATH or NAME"),
we try. And, indeed, this is harmless, as the caveat from that
commit message states.
Now, it turns out that Buildah, a tool to create container images,
sharing its codebase with Podman, passes a procfs entry to pasta, and
expects pasta to exit once the network namespace is not needed
anymore, that is, once the original Buildah process terminates.
Get this to work by using the timer fallback mechanism if the
namespace name is passed as a path belonging to a pseudo-filesystem.
This is expected to be procfs, but I covered sysfs and devpts
pseudo-filesystems as well, because nothing actually prevents
creating this kind of directory structure and links there.
Note that fstatfs(), according to some versions of man pages, was
apparently "deprecated" by the LSB. My reasoning for using it is
essentially this:
https://lore.kernel.org/linux-man/f54kudgblgk643u32tb6at4cd3kkzha6hslahv24s…
...that is, there was no such thing as an LSB deprecation, and
anyway there's no other way to get the filesystem type.
Also note that, while it might sound more obvious to detect the
filesystem type using fstatfs() on the file descriptor itself
(c->pasta_netns_fd), the reported filesystem type for it is nsfs, no
matter what path was given to pasta. If we use the parent directory,
we'll typically have either tmpfs or procfs reported.
If the target naemsapce is given as a PID, or as a PID-based procfs
entry, we don't risk races if this PID is recycled: our handle on
/proc/PID/ns will always refer to the original namespace associated
with that PID, and we don't re-open this entry from procfs to check
it.
Instead of directly monitoring the target namespace, we could have
tried to monitor a process with a given PID, using pidfd_open() to
get a handle on it, to decide when to terminate.
But it's not guaranteed that the parent process is actually the one
associated to the network namespace we operate on, and if we get a
PID file descriptor for a PID (parent or not) or path that was given
on our command line, this inherently causes a race condition as that
PID might have been recycled by the time we call pidfd_open().
Even assuming the process we want to watch is the parent process, and
a race-free usage of pidfd_open() would have been possible, I'm not
very enthusiastic about enabling yet another system call in the
seccomp profile just for this, while openat() is needed anyway.
Update the man page to reflect that, even if the target network
namespace is passed as a procfs path or a PID, we'll now quit when
the procfs entry is gone.
Reported-by: Paul Holzinger <pholzing(a)redhat.com>
Link: https://github.com/containers/podman/pull/21563#issuecomment-1948200214
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
v3: Given that we now open c->netns_dir before checking the
filesystem type, we could as well pass this file descriptor to
fstatfs() to do the check, instead of statfs() on the path.
Fix a couple of paragraphs in the commit message.
v2: Coverity Scan isn't happy if we "check" (kind of) c->netns_dir
with statfs() before opening it in a non-atomic way. Just to make
things clear, false positive or not: open it, check it, close it
if it wasn't needed: we don't rely on the check.
passt.1 | 8 ++++++--
pasta.c | 24 +++++++++++++++++++-----
2 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/passt.1 b/passt.1
index dc2d719..de6e3bf 100644
--- a/passt.1
+++ b/passt.1
@@ -550,8 +550,12 @@ without \-\-userns.
.TP
.BR \-\-no-netns-quit
-If the target network namespace is bound to the filesystem (that is, if PATH or
-NAME are given as target), do not exit once the network namespace is deleted.
+If the target network namespace is bound to the filesystem, do not exit once
+that path is deleted.
+
+If the target network namespace is represented by a procfs entry, do not exit
+once that entry is removed from procfs (representing the fact that a process
+with the given PID terminated).
.TP
.BR \-\-config-net
diff --git a/pasta.c b/pasta.c
index 01d1511..61feaa9 100644
--- a/pasta.c
+++ b/pasta.c
@@ -33,6 +33,7 @@
#include <sys/timerfd.h>
#include <sys/types.h>
#include <sys/stat.h>
+#include <sys/statfs.h>
#include <fcntl.h>
#include <sys/wait.h>
#include <signal.h>
@@ -41,6 +42,7 @@
#include <netinet/in.h>
#include <net/ethernet.h>
#include <sys/syscall.h>
+#include <linux/magic.h>
#include "util.h"
#include "passt.h"
@@ -390,12 +392,21 @@ void pasta_netns_quit_init(const struct ctx *c)
union epoll_ref ref = { .type = EPOLL_TYPE_NSQUIT_INOTIFY };
struct epoll_event ev = { .events = EPOLLIN };
int flags = O_NONBLOCK | O_CLOEXEC;
- int fd;
+ struct statfs s = { 0 };
+ bool try_inotify = true;
+ int fd = -1, dir_fd;
if (c->mode != MODE_PASTA || c->no_netns_quit || !*c->netns_base)
return;
- if ((fd = inotify_init1(flags)) < 0)
+ if ((dir_fd = open(c->netns_dir, O_CLOEXEC | O_RDONLY)) < 0)
+ die("netns dir open: %s, exiting", strerror(errno));
+
+ if (fstatfs(dir_fd, &s) || s.f_type == DEVPTS_SUPER_MAGIC ||
+ s.f_type == PROC_SUPER_MAGIC || s.f_type == SYSFS_MAGIC)
+ try_inotify = false;
+
+ if (try_inotify && (fd = inotify_init1(flags)) < 0)
warn("inotify_init1(): %s, use a timer", strerror(errno));
if (fd >= 0 && inotify_add_watch(fd, c->netns_dir, IN_DELETE) < 0) {
@@ -409,11 +420,11 @@ void pasta_netns_quit_init(const struct ctx *c)
if ((fd = pasta_netns_quit_timer()) < 0)
die("Failed to set up fallback netns timer, exiting");
- ref.nsdir_fd = open(c->netns_dir, O_CLOEXEC | O_RDONLY);
- if (ref.nsdir_fd < 0)
- die("netns dir open: %s, exiting", strerror(errno));
+ ref.nsdir_fd = dir_fd;
ref.type = EPOLL_TYPE_NSQUIT_TIMER;
+ } else {
+ close(dir_fd);
}
if (fd > FD_REF_MAX)
@@ -463,6 +474,9 @@ void pasta_netns_quit_timer_handler(struct ctx *c, union epoll_ref ref)
fd = openat(ref.nsdir_fd, c->netns_base, O_PATH | O_CLOEXEC);
if (fd < 0) {
+ if (errno == EACCES) /* Expected for existing procfs entry */
+ return;
+
info("Namespace %s is gone, exiting", c->netns_base);
exit(EXIT_SUCCESS);
}
--
2.39.2
We watch network namespace entries to detect when we should quit
(unless --no-netns-quit is passed), and these might stored in a tmpfs
typically mounted at /run/user/UID or /var/run/user/UID, or found in
procfs at /proc/PID/ns/.
Currently, we try to use inotify for any possible location of those
entries, but inotify, of course, doesn't work on pseudo-filesystems
(see inotify(7)).
The man page reflects this: the description of --no-netns-quit
implies that we won't quit anyway if the namespace is not "bound to
the filesystem".
Well, we won't quit, but, since commit 9e0dbc894813 ("More
deterministic detection of whether argument is a PID, PATH or NAME"),
we try. And, indeed, this is harmless, as the caveat from that
commit message states.
Now, it turns out that Buildah, a tool to create container images,
sharing its codebase with Podman, passes a procfs entry to pasta, and
expects pasta to exit once the network namespace is not needed
anymore, that is, once the original Buildah process terminates.
Get this to work by using the timer fallback mechanism if the
namespace name is passed as a path belonging to a pseudo-filesystem.
This is expected to be procfs, but I covered sysfs and devpts
pseudo-filesystems as well, because nothing actually prevents
creating this kind of directory structure and links there.
Note that statfs(), according to some versions of man pages, was
apparently "deprecated" by the LSB. My reasoning for using it is
essentially this:
https://lore.kernel.org/linux-man/f54kudgblgk643u32tb6at4cd3kkzha6hslahv24s…
...that is, there was no such thing as an LSB deprecation, and
anyway there's no other way to get the filesystem type.
Also note that, while it might sound more robust to detect the
filesystem type using fstatfs() on the file descriptor
(c->pasta_netns_fd), the reported filesystem type for it is nsfs, no
matter what path was given to pasta. If we use the parent directory,
we'll typically have either tmpfs or procfs reported.
The timer, however, still uses the file descriptor of the parent
directory later, as it has no access to the filesystem, and that
avoids possible races if the PID is recycled: if the original process
terminates, the handle we have on /proc/PID/ns still refers to it,
not to any other process with the same PID.
We could have used pidfd_open() to get a handle on the parent
process. But it's not guaranteed that the parent process is actually
the one associated to the network namespace we operate on, and if we
get a PID file descriptor for a PID (parent or not) or path that was
given on our command line, this inherently causes a race condition as
that PID might have been recycled by the time we call pidfd_open().
Even assuming the process we want to watch is the parent process, and
a race-free usage of pidfd_open() would have been possible, I'm not
very enthusiastic about enabling yet another system call in the
seccomp profile just for this, while openat() is needed anyway.
Update the man page to reflect that, even if the target network
namespace is passed as a procfs path or a PID, we'll now quit when
the procfs entry is gone.
Reported-by: Paul Holzinger <pholzing(a)redhat.com>
Link: https://github.com/containers/podman/pull/21563#issuecomment-1948200214
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
passt.1 | 8 ++++++--
pasta.c | 15 +++++++++++++--
2 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/passt.1 b/passt.1
index dc2d719..de6e3bf 100644
--- a/passt.1
+++ b/passt.1
@@ -550,8 +550,12 @@ without \-\-userns.
.TP
.BR \-\-no-netns-quit
-If the target network namespace is bound to the filesystem (that is, if PATH or
-NAME are given as target), do not exit once the network namespace is deleted.
+If the target network namespace is bound to the filesystem, do not exit once
+that path is deleted.
+
+If the target network namespace is represented by a procfs entry, do not exit
+once that entry is removed from procfs (representing the fact that a process
+with the given PID terminated).
.TP
.BR \-\-config-net
diff --git a/pasta.c b/pasta.c
index 01d1511..4110917 100644
--- a/pasta.c
+++ b/pasta.c
@@ -33,6 +33,7 @@
#include <sys/timerfd.h>
#include <sys/types.h>
#include <sys/stat.h>
+#include <sys/statfs.h>
#include <fcntl.h>
#include <sys/wait.h>
#include <signal.h>
@@ -41,6 +42,7 @@
#include <netinet/in.h>
#include <net/ethernet.h>
#include <sys/syscall.h>
+#include <linux/magic.h>
#include "util.h"
#include "passt.h"
@@ -390,12 +392,18 @@ void pasta_netns_quit_init(const struct ctx *c)
union epoll_ref ref = { .type = EPOLL_TYPE_NSQUIT_INOTIFY };
struct epoll_event ev = { .events = EPOLLIN };
int flags = O_NONBLOCK | O_CLOEXEC;
- int fd;
+ struct statfs s = { 0 };
+ bool try_inotify = true;
+ int fd = -1;
if (c->mode != MODE_PASTA || c->no_netns_quit || !*c->netns_base)
return;
- if ((fd = inotify_init1(flags)) < 0)
+ if (statfs(c->netns_dir, &s) || s.f_type == DEVPTS_SUPER_MAGIC ||
+ s.f_type == PROC_SUPER_MAGIC || s.f_type == SYSFS_MAGIC)
+ try_inotify = false;
+
+ if (try_inotify && (fd = inotify_init1(flags)) < 0)
warn("inotify_init1(): %s, use a timer", strerror(errno));
if (fd >= 0 && inotify_add_watch(fd, c->netns_dir, IN_DELETE) < 0) {
@@ -463,6 +471,9 @@ void pasta_netns_quit_timer_handler(struct ctx *c, union epoll_ref ref)
fd = openat(ref.nsdir_fd, c->netns_base, O_PATH | O_CLOEXEC);
if (fd < 0) {
+ if (errno == EACCES) /* Expected for existing procfs entry */
+ return;
+
info("Namespace %s is gone, exiting", c->netns_base);
exit(EXIT_SUCCESS);
}
--
2.39.2
We watch network namespace entries to detect when we should quit
(unless --no-netns-quit is passed), and these might stored in a tmpfs
typically mounted at /run/user/UID or /var/run/user/UID, or found in
procfs at /proc/PID/ns/.
Currently, we try to use inotify for any possible location of those
entries, but inotify, of course, doesn't work on pseudo-filesystems
(see inotify(7)).
The man page reflects this: the description of --no-netns-quit
implies that we won't quit anyway if the namespace is not "bound to
the filesystem".
Well, we won't quit, but, since commit 9e0dbc894813 ("More
deterministic detection of whether argument is a PID, PATH or NAME"),
we try. And, indeed, this is harmless, as the caveat from that
commit message states.
Now, it turns out that Buildah, a tool to create container images,
sharing its codebase with Podman, passes a procfs entry to pasta, and
expects pasta to exit once the network namespace is not needed
anymore, that is, once the original Buildah process terminates.
Get this to work by using the timer fallback mechanism if the
namespace name is passed as a path belonging to a pseudo-filesystem.
This is expected to be procfs, but I covered sysfs and devpts
pseudo-filesystems as well, because nothing actually prevents
creating this kind of directory structure and links there.
Note that statfs(), according to some versions of man pages, was
apparently "deprecated" by the LSB. My reasoning for using it is
essentially this:
https://lore.kernel.org/linux-man/f54kudgblgk643u32tb6at4cd3kkzha6hslahv24s…
...that is, there was no such thing as an LSB deprecation, and
anyway there's no other way to get the filesystem type.
Also note that, while it might sound more robust to detect the
filesystem type using fstatfs() on the file descriptor
(c->pasta_netns_fd), the reported filesystem type for it is nsfs, no
matter what path was given to pasta. If we use the parent directory,
we'll typically have either tmpfs or procfs reported.
The timer, however, still uses the file descriptor of the parent
directory later, as it has no access to the filesystem, and that
avoids possible races if the PID is recycled: if the original process
terminates, the handle we have on /proc/PID/ns still refers to it,
not to any other process with the same PID.
We could have used pidfd_open() to get a handle on the parent
process. But it's not guaranteed that the parent process is actually
the one associated to the network namespace we operate on, and if we
get a PID file descriptor for a PID (parent or not) or path that was
given on our command line, this inherently causes a race condition as
that PID might have been recycled by the time we call pidfd_open().
Even assuming the process we want to watch is the parent process, and
a race-free usage of pidfd_open() would have been possible, I'm not
very enthusiastic about enabling yet another system call in the
seccomp profile just for this, while openat() is needed anyway.
Update the man page to reflect that, even if the target network
namespace is passed as a procfs path or a PID, we'll now quit when
the procfs entry is gone.
Reported-by: Paul Holzinger <pholzing(a)redhat.com>
Link: https://github.com/containers/podman/pull/21563#issuecomment-1948200214
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
v2: Coverity Scan isn't happy if we "check" (kind of) c->netns_dir
with statfs() before opening it in a non-atomic way. Just to make
things clear, false positive or not: open it, check it, close it
if it wasn't needed: we don't rely on the check.
passt.1 | 8 ++++++--
pasta.c | 24 +++++++++++++++++++-----
2 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/passt.1 b/passt.1
index dc2d719..de6e3bf 100644
--- a/passt.1
+++ b/passt.1
@@ -550,8 +550,12 @@ without \-\-userns.
.TP
.BR \-\-no-netns-quit
-If the target network namespace is bound to the filesystem (that is, if PATH or
-NAME are given as target), do not exit once the network namespace is deleted.
+If the target network namespace is bound to the filesystem, do not exit once
+that path is deleted.
+
+If the target network namespace is represented by a procfs entry, do not exit
+once that entry is removed from procfs (representing the fact that a process
+with the given PID terminated).
.TP
.BR \-\-config-net
diff --git a/pasta.c b/pasta.c
index 01d1511..465fe1a 100644
--- a/pasta.c
+++ b/pasta.c
@@ -33,6 +33,7 @@
#include <sys/timerfd.h>
#include <sys/types.h>
#include <sys/stat.h>
+#include <sys/statfs.h>
#include <fcntl.h>
#include <sys/wait.h>
#include <signal.h>
@@ -41,6 +42,7 @@
#include <netinet/in.h>
#include <net/ethernet.h>
#include <sys/syscall.h>
+#include <linux/magic.h>
#include "util.h"
#include "passt.h"
@@ -390,12 +392,21 @@ void pasta_netns_quit_init(const struct ctx *c)
union epoll_ref ref = { .type = EPOLL_TYPE_NSQUIT_INOTIFY };
struct epoll_event ev = { .events = EPOLLIN };
int flags = O_NONBLOCK | O_CLOEXEC;
- int fd;
+ struct statfs s = { 0 };
+ bool try_inotify = true;
+ int fd = -1, dir_fd;
if (c->mode != MODE_PASTA || c->no_netns_quit || !*c->netns_base)
return;
- if ((fd = inotify_init1(flags)) < 0)
+ if ((dir_fd = open(c->netns_dir, O_CLOEXEC | O_RDONLY)) < 0)
+ die("netns dir open: %s, exiting", strerror(errno));
+
+ if (statfs(c->netns_dir, &s) || s.f_type == DEVPTS_SUPER_MAGIC ||
+ s.f_type == PROC_SUPER_MAGIC || s.f_type == SYSFS_MAGIC)
+ try_inotify = false;
+
+ if (try_inotify && (fd = inotify_init1(flags)) < 0)
warn("inotify_init1(): %s, use a timer", strerror(errno));
if (fd >= 0 && inotify_add_watch(fd, c->netns_dir, IN_DELETE) < 0) {
@@ -409,11 +420,11 @@ void pasta_netns_quit_init(const struct ctx *c)
if ((fd = pasta_netns_quit_timer()) < 0)
die("Failed to set up fallback netns timer, exiting");
- ref.nsdir_fd = open(c->netns_dir, O_CLOEXEC | O_RDONLY);
- if (ref.nsdir_fd < 0)
- die("netns dir open: %s, exiting", strerror(errno));
+ ref.nsdir_fd = dir_fd;
ref.type = EPOLL_TYPE_NSQUIT_TIMER;
+ } else {
+ close(dir_fd);
}
if (fd > FD_REF_MAX)
@@ -463,6 +474,9 @@ void pasta_netns_quit_timer_handler(struct ctx *c, union epoll_ref ref)
fd = openat(ref.nsdir_fd, c->netns_base, O_PATH | O_CLOEXEC);
if (fd < 0) {
+ if (errno == EACCES) /* Expected for existing procfs entry */
+ return;
+
info("Namespace %s is gone, exiting", c->netns_base);
exit(EXIT_SUCCESS);
}
--
2.39.2
...Podman users might get confused by the fact that if we can't
find a default route for a given IP version, we'll report that as a
warning message and possibly just before actual error messages.
However, a lack of routable interface for IPv4 or IPv6 can be a
normal circumstance: don't warn about it, just state that as
informational message, if those are displayed (they're not in
non-error paths in Podman, for example).
While at it, make it clear that we're disabling IPv4 or IPv6 if
there's no routable interface for the corresponding IP version.
Reported-by: Paul Holzinger <pholzing(a)redhat.com>
Link: https://github.com/containers/podman/pull/21563#issuecomment-1937024642
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
v2: Report that we're disabling IPv4 or IPv6 in the message
conf.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/conf.c b/conf.c
index 5e15b66..3646700 100644
--- a/conf.c
+++ b/conf.c
@@ -579,7 +579,7 @@ static unsigned int conf_ip4(unsigned int ifi,
ifi = nl_get_ext_if(nl_sock, AF_INET);
if (!ifi) {
- warn("No external routable interface for IPv4");
+ info("No routable interface for IPv4: IPv4 is disabled");
return 0;
}
@@ -651,7 +651,7 @@ static unsigned int conf_ip6(unsigned int ifi,
ifi = nl_get_ext_if(nl_sock, AF_INET6);
if (!ifi) {
- warn("No external routable interface for IPv6");
+ info("No routable interface for IPv6: IPv6 is disabled");
return 0;
}
--
2.39.2
The new version with tag 2024_02_16.08344da includes the following changes:
08344da selinux: Allow pasta to remount procfs
338b632 conf: No routable interface for IPv4 or IPv6 is informational, not a warning
8f3f8e1 pasta: Add fallback timer mechanism to check if namespace is gone
f57a2fb conf, passt.1: Exit if we can't bind a forwarded port, except for -[tu] all
927cb84 udp: udp_sock_init_ns() partially duplicats udp_port_rebind_outbound()
96ad5c5 udp: Don't prematurely (and incorrectly) set up automatic inbound forwards
9f57983 netlink: Use const rtnh pointer
7ee4e17 log: setlogmask(0) can actually result in a system call, don't use it
78901ee tcp: Fix subtle bug in fast re-transmit path
6c7623d netlink: Add support to fetch default gateway from multipath routes
322660b icmp: Dedicated functions for starting and closing ping sequences
b6a4e20 icmp: Validate packets received on ping sockets
6e86511 icmp: Warn on receive errors from ping sockets
a325121 icmp: Consolidate icmp_sock_handler() with icmpv6_sock_handler()
70d43f9 icmp: Share more between IPv4 and IPv6 paths in icmp_tap_handler()
15be1bf icmp: Simplify socket expiry scanning
24badd0 icmp: Use -1 to represent "missing" sockets
43713af icmp: Don't attempt to match host IDs to guest IDs
8534cdb icmp: Don't attempt to handle "wrong direction" ping socket traffic
2cb2fe6 icmp: Remove redundant initialisation of sendto() address
5dffb99 icmp: Don't set "port" on destination sockaddr for ping sockets
8981a72 flow: Avoid moving flow entries to compact table
9c0881d flow: Enforce that freeing of closed flows must happen in deferred handlers
4a849e9 flow: Abstract allocation of new flows with helper function
fb7c001 flow: Move flow_count from context structure to a global
7f37bf4 flow: Move flow_log_() to near top of flow.c
02e092b tcp, tcp_splice: Avoid double layered dispatch for connected TCP sockets
70121ca epoll: Better handling of number of epoll types
36dfa8b flow, tcp: Add handling for per-flow timers
b43e448 flow, tcp: Add flow-centric dispatch for deferred flow handling
c97bb52 tcp, tcp_splice: Move per-type cleanup logic into per-type helpers
eebca11 tcp, tcp_splice: Remove redundant handling from tcp_timer()
8563e7c treewide: Standardise on 'now' for current timestamp variables
17bbab1 flow: Make flow_table.h #include the protocol specific headers it needs
00c6eb6 pif: Remove unused pif_name() function
a179ca6 treewide: Make a bunch of pointer variables pointers to const
f60c851 test: Fix passt.mbuto for cases where /usr/sbin doesn't exist
https://passt.top/passt/log/?qt=range&q=2023_12_30.f091893..2024_02_16.0834…
Packages:
- Arch Linux:
https://www.archlinux.org/packages/extra/x86_64/passt/https://archlinuxarm.org/packages/aarch64/passthttps://archlinuxarm.org/packages/armv7h/passt
- Debian tracker:
https://tracker.debian.org/pkg/passt
- Copr (CentOS Stream, EPEL, Fedora, Mageia):
https://copr.fedorainfracloud.org/coprs/sbrivio/passt/build/7022413/
permanent mirror: https://passt.top/builds/copr/0^20240216.g08344da/
- Fedora updates:
https://bodhi.fedoraproject.org/updates/?packages=passt
- Gentoo versions:
https://packages.gentoo.org/packages/net-misc/passt
- Ubuntu tracker:
https://launchpad.net/ubuntu/+source/passt
- Void Linux:
https://voidlinux.org/packages/?q=passt
- Static builds:
- Package for other RPM-based distributions, x86_64 only:
https://passt.top/builds/latest/x86_64/passt-g08344da-1.x86_64.rpm
- x86_64 static binaries:
https://passt.top/builds/latest/x86_64/
- Debian package, from x86_64 static build:
https://passt.top/builds/latest/x86_64/passt_08344da-1_all.deb
--
Stefano