We already had a couple of places we were working around clang-tidy
issue 58992, and the flow table series adds more. I got sick of ugly
inlines every time we used a syscall which returns a socket address,
so I wrote a patch to consolidate the workarounds in one place.
However, that patch added an include of <string.h> to util.h which
exposed a classic C library gotcha in packet.c, so I fixed that too.
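For readers who haven't seen the workaround pattern: a minimal sketch,
assuming the workaround amounts to initialising the address buffer before
any syscall that writes a socket address, so that the analyser sees the
memory as initialised. The helper name getsockname_init() is hypothetical
and is not the name used in the patch.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (name is an assumption, not taken from the patch):
 * zero the caller's buffer before a syscall that writes a socket address,
 * so a static analyser treats the memory as initialised even if it doesn't
 * model the syscall's output.
 */
static inline int getsockname_init(int fd, struct sockaddr *sa, socklen_t *len)
{
	assert(sa && len);

	memset(sa, 0, *len);
	return getsockname(fd, sa, len);
}
```

Keeping this in one wrapper means call sites don't each need their own
initialisation and suppression comments.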
David Gibson (2):
packet: Avoid shadowing index(3)
util: Consolidate and improve workarounds for clang-tidy issue 58992
Makefile | 2 +-
icmp.c | 5 -----
packet.c | 28 ++++++++++++++--------------
packet.h | 10 +++++-----
tcp.c | 8 +-------
util.h | 41 +++++++++++++++++++++++++++++++++++++++++
6 files changed, 62 insertions(+), 32 deletions(-)
--
2.41.0
Problem: I have a Cloud Hypervisor virtual machine that needs both
(1) internet access without fiddling with iptables/Netfilter and
(2) VM <-> host access (to be able to provision this VM over SSH)
without dealing with passt port forwarding: it doesn't seem to be
possible to map the whole IP address, yet the users expect an IP
instead of an IP:port combination.
Requirement #1 is why I've chosen passt and it's pretty much
satisfied (thank you for this great piece of software!).
Requirement #2 implies some kind of bridge interface on the host
with one TAP interface for the VM and the other for passt.
However, only pasta can accept TAP interface FDs in its -F/--fd,
which is OK, but it also configures unneeded namespacing, which in
turn results in unneeded complexity and performance overhead due
to the need of involving veth pairs to break away from the pasta
namespace to the host for the requirement #2 to be satisfied.
I've also considered proxying the UNIX domain socket communication
to/from a TAP interface in my own Golang code, but it incurs
significant performance overhead.
On the other hand, passt already seems able to do everything I need;
it just needs some guidance on which type of FD it's dealing with.
Solution: introduce a --fd-is-tap command-line flag to tell passt
which type of FD it's being passed and force it to use the appropriate
system calls and offset calculation.
This patch also clarifies the -F/--fd description for pasta to note
that we're expecting a TAP device and not a UNIX domain socket.
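To illustrate the "offset calculation" part, here is a minimal sketch of
how the header length depends on the FD type. It mirrors the idea behind
tap_hdr_len_() and tap_iov_len() in tap.h, but struct frame_hdr and the
function names are hypothetical stand-ins: on a UNIX domain socket each
frame carries a 32-bit length prefix before the Ethernet header, while a
TAP device takes the bare Ethernet frame.

```c
#include <arpa/inet.h>
#include <net/ethernet.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct tap_hdr: length prefix + Ethernet header */
struct frame_hdr {
	uint32_t vnet_len;		/* frame length, network byte order */
	struct ether_header eh;
} __attribute__((packed));

/* Header bytes preceding the L3 payload for the given FD type */
static inline size_t frame_hdr_len(bool fd_is_socket)
{
	return fd_is_socket ? sizeof(struct frame_hdr)
			    : sizeof(struct ether_header);
}

/* Fill in the length prefix if needed, return total bytes to send */
static inline size_t frame_len(bool fd_is_socket, struct frame_hdr *h,
			       size_t plen)
{
	if (fd_is_socket)
		h->vnet_len = htonl((uint32_t)(plen + sizeof(h->eh)));

	return plen + frame_hdr_len(fd_is_socket);
}
```

With a 100-byte payload this gives 118 bytes on the socket path (4 + 14 +
100) and 114 bytes on the TAP path.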
---
README.md | 4 ++++
conf.c | 14 +++++++++++++-
passt.c | 2 ++
passt.h | 1 +
tap.c | 8 +++++---
tap.h | 4 ++--
6 files changed, 27 insertions(+), 6 deletions(-)
diff --git a/README.md b/README.md
index 6d00313..a78288f 100644
--- a/README.md
+++ b/README.md
@@ -381,6 +381,10 @@ descriptor that's already opened.
This approach, compared to using a _tap_ device, doesn't require any security
capabilities, as we don't need to create any interface.
+However, if you already have a _tap_ device opened by other means, you can
+specify `--fd-is-tap` command-line option and _passt_ will treat the file
+descriptor passed in `-F`/`--fd` option as a pre-opened TAP device.
+
_pasta_ runs out of the box with any recent (post-3.8) Linux kernel.
## Services
diff --git a/conf.c b/conf.c
index 0ad6e23..d622fdf 100644
--- a/conf.c
+++ b/conf.c
@@ -803,7 +803,12 @@ static void print_usage(const char *name, int status)
UNIX_SOCK_PATH, 1);
}
- info( " -F, --fd FD Use FD as pre-opened connected socket");
+ if (strstr(name, "pasta")) {
+ info( " -F, --fd FD Use FD as pre-opened TAP device");
+ } else {
+ info( " -F, --fd FD Use FD as pre-opened and connected UNIX domain socket");
+ info( " --fd-is-tap Treat FD as pre-opened TAP device instead of connected UNIX domain socket");
+ }
info( " -p, --pcap FILE Log tap-facing traffic to pcap file");
info( " -P, --pid FILE Write own PID to the given file");
info( " -m, --mtu MTU Assign MTU via DHCP/NDP");
@@ -1232,6 +1237,7 @@ void conf(struct ctx *c, int argc, char **argv)
{"config-net", no_argument, NULL, 17 },
{"no-copy-routes", no_argument, NULL, 18 },
{"no-copy-addrs", no_argument, NULL, 19 },
+ {"fd-is-tap", no_argument, NULL, 20 },
{ 0 },
};
struct get_bound_ports_ns_arg ns_ports_arg = { .c = c };
@@ -1411,6 +1417,12 @@ void conf(struct ctx *c, int argc, char **argv)
warn("--no-copy-addrs will be dropped soon");
c->no_copy_addrs = copy_addrs_opt = true;
break;
+ case 20:
+ if (c->mode != MODE_PASST)
+ die("--fd-is-tap is for passt mode only");
+
+ c->fd_tap_is_socket = false;
+ break;
case 'd':
if (c->debug)
die("Multiple --debug options given");
diff --git a/passt.c b/passt.c
index 8ddd9b3..b7276ff 100644
--- a/passt.c
+++ b/passt.c
@@ -195,9 +195,11 @@ int main(int argc, char **argv)
}
c.mode = MODE_PASTA;
+ c.fd_tap_is_socket = false;
log_name = "pasta";
} else if (strstr(name, "passt")) {
c.mode = MODE_PASST;
+ c.fd_tap_is_socket = true;
log_name = "passt";
} else {
exit(EXIT_FAILURE);
diff --git a/passt.h b/passt.h
index 282bd1a..2079cd0 100644
--- a/passt.h
+++ b/passt.h
@@ -264,6 +264,7 @@ struct ctx {
int epollfd;
int fd_tap_listen;
int fd_tap;
+ bool fd_tap_is_socket;
unsigned char mac[ETH_ALEN];
unsigned char mac_guest[ETH_ALEN];
diff --git a/tap.c b/tap.c
index 93db989..12b66ca 100644
--- a/tap.c
+++ b/tap.c
@@ -76,7 +76,7 @@ int tap_send(const struct ctx *c, const void *data, size_t len)
{
pcap(data, len);
- if (c->mode == MODE_PASST) {
+ if (c->fd_tap_is_socket) {
int flags = MSG_NOSIGNAL | MSG_DONTWAIT;
uint32_t vnet_len = htonl(len);
@@ -421,7 +421,7 @@ void tap_send_frames(struct ctx *c, const struct iovec *iov, size_t n)
if (!n)
return;
- if (c->mode == MODE_PASST)
+ if (c->fd_tap_is_socket)
m = tap_send_frames_passt(c, iov, n);
else
m = tap_send_frames_pasta(c, iov, n);
@@ -1176,6 +1176,7 @@ void tap_listen_handler(struct ctx *c, uint32_t events)
}
c->fd_tap = accept4(c->fd_tap_listen, NULL, NULL, 0);
+ c->fd_tap_is_socket = true;
if (!getsockopt(c->fd_tap, SOL_SOCKET, SO_PEERCRED, &ucred, &len))
info("accepted connection from PID %i", ucred.pid);
@@ -1225,6 +1226,7 @@ static int tap_ns_tun(void *arg)
die("Tap device opened but no network interface found");
c->fd_tap = fd;
+ c->fd_tap_is_socket = false;
return 0;
}
@@ -1273,7 +1275,7 @@ void tap_sock_init(struct ctx *c)
ASSERT(c->one_off);
ref.fd = c->fd_tap;
- if (c->mode == MODE_PASST)
+ if (c->fd_tap_is_socket)
ref.type = EPOLL_TYPE_TAP_PASST;
else
ref.type = EPOLL_TYPE_TAP_PASTA;
diff --git a/tap.h b/tap.h
index 021fb7c..3626e49 100644
--- a/tap.h
+++ b/tap.h
@@ -20,7 +20,7 @@ struct tap_hdr {
static inline size_t tap_hdr_len_(const struct ctx *c)
{
- if (c->mode == MODE_PASST)
+ if (c->fd_tap_is_socket)
return sizeof(struct tap_hdr);
else
return sizeof(struct ethhdr);
@@ -52,7 +52,7 @@ static inline void *tap_iov_base(const struct ctx *c, struct tap_hdr *taph)
static inline size_t tap_iov_len(const struct ctx *c, struct tap_hdr *taph,
size_t plen)
{
- if (c->mode == MODE_PASST)
+ if (c->fd_tap_is_socket)
taph->vnet_len = htonl(plen + sizeof(taph->eh));
return plen + tap_hdr_len_(c);
}
--
2.39.2 (Apple Git-144)
The regular expression I used when relicensing to GPLv2+ missed this.
Fixes: ca2749e1bd52 ("passt: Relicense to GPL 2.0, or any later version")
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
util.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/util.h b/util.h
index e4db33a..195023f 100644
--- a/util.h
+++ b/util.h
@@ -15,8 +15,8 @@
#define VERSION_BLOB \
VERSION "\n" \
"Copyright Red Hat\n" \
- "GNU Affero GPL version 3 or later " \
- "<https://www.gnu.org/licenses/agpl-3.0.html>\n" \
+ "GNU General Public License, version 2 or later\n" \
+ " <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>\n" \
"This is free software: you are free to change and redistribute it.\n" \
"There is NO WARRANTY, to the extent permitted by law.\n\n"
--
2.39.2
In the course of investigating bug 68, I discovered a number of pretty
serious bugs in how we handle various cases in tcp_tap_handler() and
tcp_data_from_tap(). This series fixes a number of them.
Note that while I'm pretty sure the bugs fixed here are real, I
haven't yet positively traced how they lead to the symptoms in bug 68
- I'm still waiting on the results from some special instrumentation
to track that down.
Link: https://bugs.passt.top/show_bug.cgi?id=68
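As a generic illustration of the "advance by the consumed packet count"
pattern (not the actual tcp_tap_handler() code): a minimal sketch in which
a per-flow handler reports how many packets of the batch it consumed and
the caller resumes from exactly that point. struct pkt, handle_run() and
handle_batch() are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical packet descriptor: only the owning flow matters here */
struct pkt {
	int flow;
};

/* Consume the leading run of packets belonging to the same flow as p[0].
 * Requires n >= 1. Returns the number of packets consumed (at least 1).
 */
static size_t handle_run(const struct pkt *p, size_t n)
{
	size_t i = 1;

	while (i < n && p[i].flow == p[0].flow)
		i++;

	return i;
}

/* Walk a batch, advancing by whatever each handler call consumed.
 * Returns the number of handler invocations.
 */
static size_t handle_batch(const struct pkt *p, size_t n)
{
	size_t calls = 0, i = 0;

	while (i < n) {
		i += handle_run(p + i, n - i);
		calls++;
	}

	return calls;
}
```

The point is that the caller never assumes a fixed stride: skipping or
re-processing packets is only avoided if the handler's return value is the
single source of truth for how far to advance.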
David Gibson (8):
tcp, tap: Correctly advance through packets in tcp_tap_handler()
udp, tap: Correctly advance through packets in udp_tap_handler()
tcp: Remove some redundant packet_get() operations
tcp: Never hash match closed connections
tcp: Return consumed packet count from tcp_data_from_tap()
tcp: Correctly handle RST followed rapidly by SYN
tcp: Consolidate paths where we initiate reset on tap interface
tcp: Correct handling of FIN,ACK followed by SYN
tap.c | 29 ++++++++++--------
tcp.c | 98 +++++++++++++++++++++++++++++++----------------------------
tcp.h | 2 +-
udp.c | 15 ++++-----
udp.h | 2 +-
5 files changed, 78 insertions(+), 68 deletions(-)
--
2.41.0
l3_len was calculated from the ethernet frame size, and it
was assumed to be equal to the length stored in an IP packet.
But if the ethernet frame is padded, then l3_len calculated
that way can only be used as a bound check to validate the
length stored in an IP header. It should not be used for
calculating the l4_len.
This patch makes sure the small padded ethernet frames are
properly processed, by trusting the length stored in an IP
header.
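A minimal standalone sketch of the check described above, assuming a
Linux struct iphdr. The function name ip4_l4_len() is hypothetical, and
the extra "hlen > tot" test is an addition of this sketch, not part of
the patch: it keeps the subtraction from underflowing if tot_len is
smaller than the header length.

```c
#include <arpa/inet.h>
#include <netinet/ip.h>
#include <stddef.h>
#include <sys/types.h>

/* l3_len is the number of bytes following the Ethernet header, which may
 * include padding. Returns the L4 length taken from the IP header, or -1
 * if the header is inconsistent with the frame.
 */
static ssize_t ip4_l4_len(const struct iphdr *iph, size_t l3_len)
{
	size_t hlen, tot;

	if (l3_len < sizeof(*iph))
		return -1;

	hlen = iph->ihl * 4UL;
	tot = ntohs(iph->tot_len);

	/* Frame size only bounds tot_len: padding makes it larger */
	if (hlen < sizeof(*iph) || tot > l3_len || hlen > tot)
		return -1;

	return (ssize_t)(tot - hlen);
}
```

For example, a minimum-size 60-byte Ethernet frame carrying a 28-byte IP
packet (20-byte header, 8-byte UDP header) has l3_len 46, but the L4
length is 8, not 26.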
Signed-off-by: Stas Sergeev <stsp2(a)yandex.ru>
CC: Stefano Brivio <sbrivio(a)redhat.com>
---
tap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tap.c b/tap.c
index ee79be0..8d7859c 100644
--- a/tap.c
+++ b/tap.c
@@ -615,7 +615,7 @@ resume:
continue;
hlen = iph->ihl * 4UL;
- if (hlen < sizeof(*iph) || htons(iph->tot_len) != l3_len ||
+ if (hlen < sizeof(*iph) || htons(iph->tot_len) > l3_len ||
hlen > l3_len)
continue;
@@ -623,7 +623,7 @@ resume:
if (tap4_is_fragment(iph, now))
continue;
- l4_len = l3_len - hlen;
+ l4_len = htons(iph->tot_len) - hlen;
if (iph->saddr && c->ip4.addr_seen.s_addr != iph->saddr)
c->ip4.addr_seen.s_addr = iph->saddr;
--
2.40.1
This is a second draft of the first steps in implementing more general
"connection" tracking, as described at:
https://pad.passt.top/p/NewForwardingModel
This series changes the TCP connection table into a more general flow
table that can track other protocols as well (although none are
implemented yet). Each flow uniformly keeps track of all the relevant
addresses and ports, which will allow for more robust control of NAT
and port forwarding.
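To give a rough idea of what "uniformly tracking addresses and ports"
could look like, here is a minimal sketch. The struct layouts, field
names and the _sketch suffixes are assumptions for illustration and are
not the definitions in this series; only the "flowside" and "endpoint"
terms come from the cover letter.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: one side of a flow, as seen on one interface */
struct flowside_sketch {
	struct in6_addr eaddr;	/* endpoint (remote) address */
	struct in6_addr faddr;	/* our address on this side (assumed name) */
	in_port_t eport;	/* endpoint port */
	in_port_t fport;	/* our port on this side (assumed name) */
};

/* Hypothetical: a flow of any protocol, with one flowside per interface */
struct flow_sketch {
	uint8_t type;			/* protocol-specific flow type */
	struct flowside_sketch side[2];
};

/* Compare two flowsides field by field (padding is not compared) */
static bool flowside_eq(const struct flowside_sketch *a,
			const struct flowside_sketch *b)
{
	return !memcmp(&a->eaddr, &b->eaddr, sizeof(a->eaddr)) &&
	       !memcmp(&a->faddr, &b->faddr, sizeof(a->faddr)) &&
	       a->eport == b->eport && a->fport == b->fport;
}
```

Because every flow type shares the same address/port record, lookups and
NAT decisions don't need protocol-specific field access.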
Caveats:
* We significantly increase the size of a connection/flow entry
- Can probably be mitigated, but I haven't investigated much yet
* We perform a number of extra getsockname() calls to know some of
the socket endpoints
- Haven't yet measured how much performance impact that has
- Can be mitigated in at least some cases, but again, haven't
tried yet
* Only TCP converted so far
Changes since v1:
* Terminology changes
- "Endpoint" address/port instead of "correspondent" address/port
- "flowside" instead of "demiflow"
* Actually move the connection table to a new flow table structure in
new files
* Significant rearrangement of earlier patchs on top of that new
table, to reduce churn
David Gibson (10):
flow, tcp: Generalise connection types
flow, tcp: Move TCP connection table to unified flow table
flow, tcp: Consolidate flow pointer<->index helpers
flow: Make unified version of flow table compaction
flow: Introduce struct flowside, space for uniform tracking of
addresses
tcp: Move guest side address tracking to flow/flowside
tcp, flow: Perform TCP hash calculations based on flowside
tcp: Re-use flowside_hash for initial sequence number generation
tcp: Maintain host flowside for connections
tcp_splice: Fill out flowside information for spliced connections
Makefile | 14 +-
flow.c | 111 ++++++++++++++++
flow.h | 115 +++++++++++++++++
flow_table.h | 45 +++++++
passt.h | 3 +
siphash.c | 1 +
tcp.c | 355 ++++++++++++++++++++++++---------------------------
tcp.h | 5 -
tcp_conn.h | 54 ++------
tcp_splice.c | 78 ++++++-----
tcp_splice.h | 3 +-
11 files changed, 505 insertions(+), 279 deletions(-)
create mode 100644 flow.c
create mode 100644 flow.h
create mode 100644 flow_table.h
--
2.41.0
The hard link trick didn't actually fix the issue with SELinux file
contexts properly: as opposed to symbolic links, SELinux now
correctly associates types to the labels that are set -- except that
those labels are now shared, so we can end up (depending on how
rpm(8) extracts the archives) with /usr/bin/passt having a
pasta_exec_t context.
This got rather confusing as running restorecon(8) seemed to fix up
labels -- but that's simply toggling between passt_exec_t and
pasta_exec_t for both links, because each invocation will just "fix"
the file with the mismatching context.
Replace the hard links with copies. AppArmor's attachment, instead,
works with hard links, and if there's no LSM, we can keep symbolic
links, so keep symbolic links in the Makefile.
With copies, rpmbuild(8) will warn about duplicate Build-IDs in the
same package. Mangle them in pasta binaries by adding one to the
last byte, modulo one byte, using xxd (provided by vim-common) and
disable the automatic rehashing by find-debuginfo(1) -- we already
have per-release Build-IDs thanks to $VERSION passed on 'make'.
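The byte arithmetic used in the spec file can be checked in isolation.
This is a minimal sketch of the same expression, (byte + 1) % 0xff, in C;
the function name build_id_mangle() is hypothetical. Note that reducing
modulo 0xff (rather than 0x100) means the result is never 0xff, but it
still always differs from the input, which is all that's needed to get
distinct Build-IDs.

```c
#include <stdint.h>

/* Same arithmetic as the shell expression in the spec file:
 * increment the last Build-ID byte, reduced modulo 0xff.
 */
static uint8_t build_id_mangle(uint8_t last)
{
	return (uint8_t)((last + 1) % 0xff);
}
```

For instance, 0x41 becomes 0x42, 0xfe becomes 0x00 and 0xff becomes 0x01.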
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
contrib/fedora/passt.spec | 27 ++++++++++++++++++++++-----
1 file changed, 22 insertions(+), 5 deletions(-)
diff --git a/contrib/fedora/passt.spec b/contrib/fedora/passt.spec
index d0c6895..51bf5a8 100644
--- a/contrib/fedora/passt.spec
+++ b/contrib/fedora/passt.spec
@@ -9,6 +9,10 @@
%global git_hash {{{ git_head }}}
%global selinuxtype targeted
+# Different Build-IDs for passt and pasta: don't let find-debuginfo touch them
+%undefine _unique_build_ids
+%global _no_recompute_build_ids 1
+
Name: passt
Version: {{{ git_version }}}
@@ -19,7 +23,7 @@ Group: System Environment/Daemons
URL: https://passt.top/
Source: https://passt.top/passt/snapshot/passt-%{git_hash}.tar.xz
-BuildRequires: gcc, make, checkpolicy, selinux-policy-devel
+BuildRequires: gcc, make, checkpolicy, selinux-policy-devel, binutils, vim-common
Requires: (%{name}-selinux = %{version}-%{release} if selinux-policy-%{selinuxtype})
%description
@@ -56,15 +60,28 @@ This package adds SELinux enforcement to passt(1) and pasta(1).
%install
%make_install DESTDIR=%{buildroot} prefix=%{_prefix} bindir=%{_bindir} mandir=%{_mandir} docdir=%{_docdir}/%{name}
-# The Makefile creates symbolic links for pasta, but we need hard links for
+# The Makefile creates symbolic links for pasta, but we need actual copies for
# SELinux file contexts to work as intended. Same with pasta.avx2 if present.
-ln -f %{buildroot}%{_bindir}/passt %{buildroot}%{_bindir}/pasta
+#
+# To avoid duplicate Build-IDs in the same package, we increase the last byte of
+# the value for pasta binaries by one (modulo one byte). Note that we already
+# have differentiated Build-IDs per release, courtesy of $VERSION, so we don't
+# need find-debuginfo(1) to recalculate them.
+rm %{buildroot}%{_bindir}/pasta
+objcopy --dump-section .note.gnu.build-id=%{buildroot}/build_id %{buildroot}%{_bindir}/passt
+printf '\x'$(printf %02x $(( ( 0x$(xxd -ps -s 35 %{buildroot}/build_id) + 1 ) % 0xff )) ) | dd of=%{buildroot}/build_id seek=35 bs=1 count=1 conv=notrunc
+objcopy --update-section .note.gnu.build-id=%{buildroot}/build_id %{buildroot}%{_bindir}/passt %{buildroot}%{_bindir}/pasta
+rm %{buildroot}/build_id
+
%ifarch x86_64
-ln -f %{buildroot}%{_bindir}/passt.avx2 %{buildroot}%{_bindir}/pasta.avx2
+rm %{buildroot}%{_bindir}/pasta.avx2
+objcopy --dump-section .note.gnu.build-id=%{buildroot}/build_id %{buildroot}%{_bindir}/passt.avx2
+printf '\x'$(printf %02x $(( ( 0x$(xxd -ps -s 35 %{buildroot}/build_id) + 1 ) % 0xff )) ) | dd of=%{buildroot}/build_id seek=35 bs=1 count=1 conv=notrunc
+objcopy --update-section .note.gnu.build-id=%{buildroot}/build_id %{buildroot}%{_bindir}/passt.avx2 %{buildroot}%{_bindir}/pasta.avx2
+rm %{buildroot}/build_id
ln -sr %{buildroot}%{_mandir}/man1/passt.1 %{buildroot}%{_mandir}/man1/passt.avx2.1
ln -sr %{buildroot}%{_mandir}/man1/pasta.1 %{buildroot}%{_mandir}/man1/pasta.avx2.1
-install -p -m 755 %{buildroot}%{_bindir}/passt.avx2 %{buildroot}%{_bindir}/pasta.avx2
%endif
pushd contrib/selinux
--
2.39.2