The current semantics for selecting an external interface are quite
confusing - depending on details it can pick either the interface
associated with the first default route, or the lowest numbered
interface with a default route, which might not be the same. The logic
for checking the interface in the tests isn't quite identical which
can lead to test failures when there are multiple external routes.
This series fixes that bug and makes a number of follow-on cleanups
to the detection / configuration of IP parameters from the host.
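To illustrate the difference between "first default route" and "lowest
numbered interface with a default route", and what selecting the interface
separately per IP version means, here is a minimal sketch. It is not the
code from this series: struct route and first_default_ifi() are hypothetical
names, and the routing table is modelled as a plain array rather than read
from the kernel.

```c
#include <stddef.h>
#include <sys/socket.h>	/* AF_INET, AF_INET6 */

/* Hypothetical, simplified view of one routing table entry */
struct route {
	int af;			/* AF_INET or AF_INET6 */
	int is_default;		/* non-zero for a default route */
	unsigned int ifi;	/* index of the outgoing interface */
};

/* Return the interface index of the first default route for address
 * family @af, in table order, or 0 if there is none.
 */
static unsigned int first_default_ifi(const struct route *r, size_t n, int af)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (r[i].af == af && r[i].is_default)
			return r[i].ifi;
	}

	return 0;
}
```

Called once with AF_INET and once with AF_INET6, this can return two
different interfaces, and the IPv4 result need not be the lowest numbered
interface that has a default route.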
David Gibson (7):
Allow different external interfaces for IPv4 and IPv6 connectivity
Separately locate external interfaces for IPv4 and IPv6
Initialize host side MAC when in IPv6 only mode
Move passt mac_guest init to be more symmetric with pasta
Clarify semantics of c->v4 and c->v6 variables
Separate IPv4 and IPv6 configuration
Make substructures for IPv4 and IPv6 specific context information
arp.c | 2 +-
conf.c | 326 ++++++++++++++++++++++--------------------
dhcp.c | 22 +--
dhcpv6.c | 18 +--
ndp.c | 16 +--
netlink.c | 79 +---------
netlink.h | 2 +-
passt.c | 6 +-
passt.h | 78 +++++-----
pasta.c | 14 +-
tap.c | 32 +++--
tcp.c | 56 ++++----
test/dhcp/passt | 3 +-
test/dhcp/pasta | 3 +-
test/ndp/passt | 4 +-
test/two_guests/basic | 3 +-
udp.c | 70 ++++-----
util.c | 6 +-
util.h | 6 -
19 files changed, 357 insertions(+), 389 deletions(-)
--
2.37.1
Reflect the changes from commit 4b2e018d70f3 ("Allow different
external interfaces for IPv4 and IPv6 connectivity") in the manual.
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
passt.1 | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/passt.1 b/passt.1
index 4e06c0c..378778c 100644
--- a/passt.1
+++ b/passt.1
@@ -138,8 +138,8 @@ Assign IPv4 \fIaddr\fR via DHCP (\fByiaddr\fR), or \fIaddr\fR via DHCPv6 (option
for an IPv6 \fIaddr\fR.
This option can be specified zero (for defaults) to two times (once for IPv4,
once for IPv6).
-By default, assigned IPv4 and IPv6 addresses are taken from the host interface
-with the first default route.
+By default, assigned IPv4 and IPv6 addresses are taken from the host interfaces
+with the first default route for the corresponding IP version.
.TP
.BR \-n ", " \-\-netmask " " \fImask
@@ -153,8 +153,8 @@ according to the CIDR block of the assigned address (RFC 4632).
.BR \-M ", " \-\-mac-addr " " \fIaddr
Use source MAC address \fIaddr\fR when communicating to the guest or to the
target namespace.
-Default is to use the MAC address of the interface with the first default route
-on the host.
+Default is to use the MAC address of the interface with the first IPv4 default
+route on the host.
.TP
.BR \-g ", " \-\-gateway " " \fIaddr
@@ -163,7 +163,7 @@ Assign IPv4 \fIaddr\fR as default gateway via DHCP (option 3), or IPv6
This option can be specified zero (for defaults) to two times (once for IPv4,
once for IPv6).
By default, IPv4 and IPv6 addresses are taken from the host interface with the
-first default route.
+first default route for the corresponding IP version.
Note: these addresses are also used as source address for packets directed to
the guest or to the target namespace having a loopback or local source address,
@@ -173,7 +173,8 @@ to allow mapping of local traffic to guest and target namespace. See the
.TP
.BR \-i ", " \-\-interface " " \fIname
Use host interface \fIname\fR to derive addresses and routes.
-Default is to use the interface with the first default route.
+Default is to use the interfaces with the first default routes for each IP
+version.
.TP
.BR \-D ", " \-\-dns " " \fIaddr
--
2.35.1
A couple of days ago, we started running out of space on the Debian sid
test images as we're about to install gcc -- about 50 MiB are missing.
Given that virt-resize (which could be conveniently invoked by the
Makefile for tests) reorders partitions if we expand the first one,
resize the image using qemu-img from the test script itself, and then
take care of expanding root partition and filesystem online later.
This is probably a temporary hack, so I'm not looking for a more
generic or elegant solution at the moment.
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
test/distro/debian | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/test/distro/debian b/test/distro/debian
index abbbaa2..1548761 100644
--- a/test/distro/debian
+++ b/test/distro/debian
@@ -184,10 +184,19 @@ sleep 1
hostb reset
+# HACK: We need some additional space to install gcc-12 on 'sid' images for
+# amd64 and aarch64, but if we use virt-resize to call resize2fs in the
+# preparation step, partitions will be rearranged and we would also need to
+# adjust boot parameters. Instead, resize the images offline first, and expand
+# partitions and filesystems online, later.
+
test Debian GNU/Linux sid (experimental), amd64
+host qemu-img resize __BASEPATH__/prepared-debian-sid-nocloud-amd64-daily.qcow2 4G
host ./qrap 5 qemu-system-x86_64 -M pc,accel=kvm:tcg -m 1024 -nographic -serial stdio -nodefaults -no-reboot -nographic -vga none __BASEPATH__/prepared-debian-sid-nocloud-amd64-daily.qcow2 -net socket,fd=5 -net nic,model=virtio -snapshot
sleep 2
+host growpart /dev/sda 1
+host resize2fs -p /dev/sda1
host apt-get update
host apt-get -y install make gcc netcat-openbsd
@@ -202,8 +211,11 @@ sleep 1
test Debian GNU/Linux sid (experimental), aarch64
+host qemu-img resize __BASEPATH__/prepared-debian-sid-nocloud-arm64-daily.qcow2 4G
host ./qrap 5 qemu-system-aarch64 -m 2048 -cpu cortex-a57 -smp 2 -M virt -bios __BASEPATH__/QEMU_EFI.fd -nographic -serial stdio -nodefaults -no-reboot -nographic -vga none __BASEPATH__/prepared-debian-sid-nocloud-arm64-daily.qcow2 -net socket,fd=5 -net nic,model=virtio -snapshot
sleep 2
+host growpart /dev/vda 1
+host resize2fs -p /dev/vda1
host apt-get update
host apt-get -y install make gcc netcat-openbsd
--
2.35.1
Here's yet another batch of fixes to make the tests more robust
against different environments. With this lot, I'm now able to run
the pasta, passt, passt_in_ns and two_guests tests on my Fedora
system. I'm still hitting problems with the perf tests.
This series (specifically 11/18) updates the demo to use socat instead
of openbsd netcat. This is needed, or the change to socat in the
mbuto image would break the demo. However, I'm hitting unrelated
problems trying to run the demos, so the switch to socat is, alas,
untested.
David Gibson (18):
tests: Remove no longer needed /usr/bin/bash link
tests: Let Fedora find dhclient-script in /usr/sbin
tests: Add rudimentary debugging to dhclient-script
tests: Add some extra dhclient support directories to mbuto.img
tests: More robust parsing of resolv.conf for DHCP tests
tests: Handle the case of a nameserver on host localhost
tests: Correctly handle domain search list in dhclient-script
tests: Fix detection of empty 'hout' responses in passt{,_in_ns} tests
tests: Fix creation of test file in udp passt tests
valgrind needs futex
tests: Use socat instead of netcat
tests: Remove unnecessary ^D in passt_in_ns teardown
tests: Remove unnecessary truncation of temporary files in udp tests
tests: Use dhclient --no-pid for namespaces in two_guests tests
tests: Clean up better after iperf tests
tests: No need to retrieve host ifname in ndp/pasta
tests: Correct determination of host interface name in tests
demo: Use git protocol downloads
Makefile | 2 +-
test/README.md | 2 +-
test/demo/passt | 10 +--
test/demo/pasta | 14 ++---
test/dhcp/passt | 22 +++----
test/lib/setup | 24 ++++++--
test/lib/test | 2 +-
test/ndp/pasta | 5 +-
test/passt.mbuto | 18 +++---
test/tcp/passt | 36 +++++------
test/tcp/passt_in_ns | 138 +++++++++++++++++++++---------------------
test/tcp/pasta | 58 +++++++++---------
test/two_guests/basic | 24 ++++----
test/udp/passt | 26 ++++----
test/udp/passt_in_ns | 82 +++++++++++--------------
test/udp/pasta | 34 +++++------
16 files changed, 247 insertions(+), 250 deletions(-)
--
2.36.1
I resorted to skipping the demo builds for a while, as they no longer
worked reliably -- time to fix that.
Stefano Brivio (7):
contrib: Rebase Podman patch to latest upstream
test: In passt demo, bring up eth0 in guest, not in namespace pane
test: In pasta demo, use pgrep instead of pstree to find namespace PID
test: In pasta demo, issue /sbin/dhclient instead of dhclient
test: Fix Podman build in Podman demo
test: Actually use pasta in Podman demo step with HTTP service
test: Drop further ^D in passt demo teardown
...001-libpod-Add-pasta-networking-mode.patch | 91 +++++++++----------
test/demo/passt | 2 +-
test/demo/pasta | 13 ++-
test/demo/podman | 4 +-
test/lib/setup | 4 -
5 files changed, 51 insertions(+), 63 deletions(-)
--
2.35.1
The intended semantics of --netns-only are pretty unclear to me. It's
intended for pasta, but it's not clear whether it's saying the spawned shell
should only enter the target netns, or that the passt/pasta packet
forwarding process should only sandbox itself in a network namespace, not
a user namespace.
In any case, as far as I can tell there's not actually any case in which
the --netns-only option will work. If nothing else, we will always fail
in sandbox(), because it attempts a number of operations which require
CAP_SYS_ADMIN in our current user namespace. We drop all capabilities in
our initial user namespace when we start, so the only way we can have
CAP_SYS_ADMIN at this point is if we've joined a new user namespace, which
we won't do with --netns-only.
For pasta joining an existing namespace (the apparently intended use case),
we'll actually fail before we get to that point: in conf_ns_check() we'll attempt
to join the target network namespace. This also requires CAP_SYS_ADMIN in
both our current user namespace and the user namespace which owns the
target network namespace. Again, since we've dropped capabilities in our
original namespace this will never be the case.
For pasta creating its own network namespace we'll fail for a similar
reason in yet another place. This time we'll fail in nl_sock_init(), again
because we attempt to enter the new network ns via NS_CALL without having
regained CAP_SYS_ADMIN by joining a new user namespace. Because this
happens after spawning the shell, it results in a weird failure mode, where
the pasta spawned shell is running, but pasta isn't actually handling
packets. Exiting the shell will lead to a hang until the process is
explicitly killed.
Since there's no way to invoke it, remove this feature.
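The ordering constraint described above can be sketched as follows: the user
namespace has to be joined first, since that is the only way to regain
CAP_SYS_ADMIN before entering the network namespace. This is an
illustration only; join_ns() is a hypothetical helper, not a function in
passt, and it uses the raw setns system call (rather than the setns()
wrapper used in the tree) so that the sketch does not depend on
feature-test macros.

```c
#include <linux/sched.h>	/* CLONE_NEWUSER, CLONE_NEWNET */
#include <sys/syscall.h>	/* SYS_setns */
#include <unistd.h>		/* syscall() */

/* Hypothetical helper: join the user namespace referred to by @userns_fd,
 * then the network namespace referred to by @netns_fd.
 *
 * Return: 0 on success, -1 if either step fails (errno is set by the
 * failing system call).
 */
static int join_ns(int userns_fd, int netns_fd)
{
	/* First the user namespace: this is what grants capabilities */
	if (syscall(SYS_setns, userns_fd, CLONE_NEWUSER))
		return -1;

	/* Only now can entering the network namespace be permitted */
	if (syscall(SYS_setns, netns_fd, CLONE_NEWNET))
		return -1;

	return 0;
}
```

Skipping the first step, which is what --netns-only asked for, leaves the
second call without the capability it needs once all capabilities have been
dropped in the original namespace.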
Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au>
---
conf.c | 33 ++++++++++-----------------------
passt.c | 10 ++++------
passt.h | 2 --
pasta.c | 28 ++++++++++------------------
util.c | 3 +--
5 files changed, 25 insertions(+), 51 deletions(-)
diff --git a/conf.c b/conf.c
index cddc769..1dfbba1 100644
--- a/conf.c
+++ b/conf.c
@@ -498,7 +498,7 @@ static int conf_ns_check(void *arg)
{
struct ctx *c = (struct ctx *)arg;
- if ((!c->netns_only && setns(c->pasta_userns_fd, CLONE_NEWUSER)) ||
+ if (setns(c->pasta_userns_fd, CLONE_NEWUSER) ||
setns(c->pasta_netns_fd, CLONE_NEWNET))
c->pasta_userns_fd = c->pasta_netns_fd = -1;
@@ -518,17 +518,12 @@ static int conf_ns_check(void *arg)
static int conf_ns_opt(struct ctx *c,
char *nsdir, const char *conf_userns, const char *optarg)
{
- int ufd = -1, nfd = -1, try, ret, netns_only_reset = c->netns_only;
+ int ufd = -1, nfd = -1, try, ret;
char userns[PATH_MAX] = { 0 }, netns[PATH_MAX];
char *endptr;
long pid_arg;
pid_t pid;
- if (c->netns_only && *conf_userns) {
- err("Both --userns and --netns-only given");
- return -EINVAL;
- }
-
/* It might be a PID, a netns path, or a netns name */
for (try = 0; try < 3; try++) {
if (try == 0) {
@@ -538,7 +533,7 @@ static int conf_ns_opt(struct ctx *c,
pid = pid_arg;
- if (!*conf_userns && !c->netns_only) {
+ if (!*conf_userns) {
ret = snprintf(userns, PATH_MAX,
"/proc/%i/ns/user", pid);
if (ret <= 0 || ret > (int)sizeof(userns))
@@ -548,9 +543,6 @@ static int conf_ns_opt(struct ctx *c,
if (ret <= 0 || ret > (int)sizeof(netns))
continue;
} else if (try == 1) {
- if (!*conf_userns)
- c->netns_only = 1;
-
ret = snprintf(netns, PATH_MAX, "%s", optarg);
if (ret <= 0 || ret > (int)sizeof(userns))
continue;
@@ -562,19 +554,17 @@ static int conf_ns_opt(struct ctx *c,
}
/* Don't pass O_CLOEXEC here: ns_enter() needs those files */
- if (!c->netns_only) {
- if (*conf_userns)
- /* NOLINTNEXTLINE(android-cloexec-open) */
- ufd = open(conf_userns, O_RDONLY);
- else if (*userns)
- /* NOLINTNEXTLINE(android-cloexec-open) */
- ufd = open(userns, O_RDONLY);
- }
+ if (*conf_userns)
+ /* NOLINTNEXTLINE(android-cloexec-open) */
+ ufd = open(conf_userns, O_RDONLY);
+ else if (*userns)
+ /* NOLINTNEXTLINE(android-cloexec-open) */
+ ufd = open(userns, O_RDONLY);
/* NOLINTNEXTLINE(android-cloexec-open) */
nfd = open(netns, O_RDONLY);
- if (nfd == -1 || (ufd == -1 && !c->netns_only)) {
+ if (nfd == -1 || ufd == -1) {
if (nfd >= 0)
close(nfd);
@@ -604,8 +594,6 @@ static int conf_ns_opt(struct ctx *c,
}
}
- c->netns_only = netns_only_reset;
-
return -ENOENT;
}
@@ -1046,7 +1034,6 @@ void conf(struct ctx *c, int argc, char **argv)
{"tcp-ns", required_argument, NULL, 'T' },
{"udp-ns", required_argument, NULL, 'U' },
{"userns", required_argument, NULL, 2 },
- {"netns-only", no_argument, &c->netns_only, 1 },
{"nsrun-dir", required_argument, NULL, 3 },
{"config-net", no_argument, &c->pasta_conf_ns, 1 },
{"ns-mac-addr", required_argument, NULL, 4 },
diff --git a/passt.c b/passt.c
index 58d9062..64edd39 100644
--- a/passt.c
+++ b/passt.c
@@ -199,12 +199,10 @@ static int sandbox(struct ctx *c)
{
int flags = CLONE_NEWIPC | CLONE_NEWNS | CLONE_NEWUTS;
- if (!c->netns_only) {
- if (c->pasta_userns_fd == -1)
- flags |= CLONE_NEWUSER;
- else
- setns(c->pasta_userns_fd, CLONE_NEWUSER);
- }
+ if (c->pasta_userns_fd == -1)
+ flags |= CLONE_NEWUSER;
+ else
+ setns(c->pasta_userns_fd, CLONE_NEWUSER);
c->pasta_userns_fd = -1;
diff --git a/passt.h b/passt.h
index e541341..7f9d54b 100644
--- a/passt.h
+++ b/passt.h
@@ -110,7 +110,6 @@ enum passt_modes {
* @gid: GID we should drop to, if started as root
* @pasta_netns_fd: File descriptor for network namespace in pasta mode
* @pasta_userns_fd: Descriptor for user namespace to join, -1 once joined
- * @netns_only: In pasta mode, don't join or create a user namespace
* @no_netns_quit: In pasta mode, don't exit if fs-bound namespace is gone
* @netns_base: Base name for fs-bound namespace, if any, in pasta mode
* @netns_dir: Directory of fs-bound namespace, if any, in pasta mode
@@ -177,7 +176,6 @@ struct ctx {
int pasta_netns_fd;
int pasta_userns_fd;
- int netns_only;
int no_netns_quit;
char netns_base[PATH_MAX];
diff --git a/pasta.c b/pasta.c
index 5166082..dc35fef 100644
--- a/pasta.c
+++ b/pasta.c
@@ -82,16 +82,12 @@ static int pasta_wait_for_ns(void *arg)
int flags = O_RDONLY | O_CLOEXEC;
char ns[PATH_MAX];
- if (c->netns_only)
- goto netns;
-
snprintf(ns, PATH_MAX, "/proc/%i/ns/user", pasta_child_pid);
do
while ((c->pasta_userns_fd = open(ns, flags)) < 0);
while (setns(c->pasta_userns_fd, CLONE_NEWUSER) &&
!close(c->pasta_userns_fd));
-netns:
snprintf(ns, PATH_MAX, "/proc/%i/ns/net", pasta_child_pid);
do
while ((c->pasta_netns_fd = open(ns, flags)) < 0);
@@ -121,21 +117,18 @@ static int pasta_setup_ns(void *arg)
{
struct pasta_setup_ns_arg *a = (struct pasta_setup_ns_arg *)arg;
char *shell;
+ char buf[BUFSIZ];
- if (!a->c->netns_only) {
- char buf[BUFSIZ];
-
- snprintf(buf, BUFSIZ, "%i %i %i", 0, a->euid, 1);
+ snprintf(buf, BUFSIZ, "%i %i %i", 0, a->euid, 1);
- FWRITE("/proc/self/uid_map", buf,
- "Cannot set uid_map in namespace");
+ FWRITE("/proc/self/uid_map", buf,
+ "Cannot set uid_map in namespace");
- FWRITE("/proc/self/setgroups", "deny",
- "Cannot write to setgroups in namespace");
+ FWRITE("/proc/self/setgroups", "deny",
+ "Cannot write to setgroups in namespace");
- FWRITE("/proc/self/gid_map", buf,
- "Cannot set gid_map in namespace");
- }
+ FWRITE("/proc/self/gid_map", buf,
+ "Cannot set gid_map in namespace");
FWRITE("/proc/sys/net/ipv4/ping_group_range", "0 0",
"Cannot set ping_group_range, ICMP requests might fail");
@@ -165,9 +158,8 @@ void pasta_start_ns(struct ctx *c)
pasta_child_pid = clone(pasta_setup_ns,
ns_fn_stack + sizeof(ns_fn_stack) / 2,
- (c->netns_only ? 0 : CLONE_NEWNET) |
- CLONE_NEWIPC | CLONE_NEWPID | CLONE_NEWUSER |
- CLONE_NEWUTS,
+ CLONE_NEWNET | CLONE_NEWIPC | CLONE_NEWPID |
+ CLONE_NEWUSER | CLONE_NEWUTS,
(void *)&arg);
if (pasta_child_pid == -1) {
diff --git a/util.c b/util.c
index f45bc72..11cd3f4 100644
--- a/util.c
+++ b/util.c
@@ -524,8 +524,7 @@ void check_root(struct ctx *c)
*/
int ns_enter(const struct ctx *c)
{
- if (!c->netns_only &&
- c->pasta_userns_fd != -1 &&
+ if (c->pasta_userns_fd != -1 &&
setns(c->pasta_userns_fd, CLONE_NEWUSER))
exit(EXIT_FAILURE);
--
2.36.1
We handle SIGQUIT and SIGTERM by calling exit(), which is usually
implemented with the exit_group() system call.
If we don't allow exit_group(), we'll get a SIGSYS while handling
SIGQUIT and SIGTERM, which means a misleading non-zero exit code.
Reported-by: Wenli Quan <wquan(a)redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2101990
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
Makefile | 2 +-
README.md | 2 +-
passt.c | 2 ++
3 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/Makefile b/Makefile
index 0077fc9..6f7c971 100644
--- a/Makefile
+++ b/Makefile
@@ -115,7 +115,7 @@ qrap: $(QRAP_SRCS) passt.h
valgrind: EXTRA_SYSCALLS="rt_sigprocmask rt_sigtimedwait rt_sigaction \
getpid gettid kill clock_gettime mmap munmap open \
- unlink exit_group gettimeofday"
+ unlink gettimeofday"
valgrind: CFLAGS:=-g -O0 $(filter-out -O%,$(CFLAGS))
valgrind: all
diff --git a/README.md b/README.md
index 4fed6d5..628b9bb 100644
--- a/README.md
+++ b/README.md
@@ -286,7 +286,7 @@ speeding up local connections, and usually requiring NAT. _pasta_:
* ✅ all capabilities dropped, other than `CAP_NET_BIND_SERVICE` (if granted)
* ✅ with default options, user, mount, IPC, UTS, PID namespaces are detached
* ✅ no external dependencies (other than a standard C library)
-* ✅ restrictive seccomp profiles (25 syscalls allowed for _passt_, 39 for
+* ✅ restrictive seccomp profiles (26 syscalls allowed for _passt_, 40 for
_pasta_ on x86_64)
* ✅ examples of [AppArmor](/passt/tree/contrib/apparmor) and
[SELinux](/passt/tree/contrib/selinux) profiles available
diff --git a/passt.c b/passt.c
index 56fcf5f..a8d94b4 100644
--- a/passt.c
+++ b/passt.c
@@ -257,6 +257,8 @@ static int sandbox(struct ctx *c)
*
* TODO: After unsharing the PID namespace and forking, SIG_DFL for SIGTERM and
* SIGQUIT unexpectedly doesn't cause the process to terminate, figure out why.
+ *
+ * #syscalls exit_group
*/
void exit_handler(int signal)
{
--
2.35.1