Problem: when passt/pasta are working in a broadcast domain with more
than one host machine, it will answer for all of these machines,
except for the one having --address. This is akin to ARP spoofing
and breaks connection with these machines if passt/pasta ARP reply
arrives before the original one.
Solution: only be responsible and send ARP replies
for the --gateway's address.
---
arp.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arp.c b/arp.c
index a35c1b6..…
[View More]f873491 100644
--- a/arp.c
+++ b/arp.c
@@ -67,8 +67,8 @@ int arp(const struct ctx *c, const struct pool *p)
!memcmp(am->sip, am->tip, sizeof(am->sip)))
return 1;
- /* Don't resolve our own address, either. */
- if (!memcmp(am->tip, &c->ip4.addr, sizeof(am->tip)))
+ /* Don't resolve anything but gateway address. */
+ if (memcmp(am->tip, &c->ip4.gw, sizeof(am->tip)) != 0)
return 1;
ah->ar_op = htons(ARPOP_REPLY);
--
2.39.2 (Apple Git-144)
[View Less]
The fundamental patch here is 3/5, which is a workaround for a rather
surprising kernel behaviour we seem to be hitting. This all comes from
the investigation around https://bugs.passt.top/show_bug.cgi?id=74.
I can't hit stalls anymore and throughput looks finally good to me
(~3.5gbps with 208 KiB rmem_max and wmem_max), but... please test.
Stefano Brivio (5):
tcp: Fix comment to tcp_sock_consume()
tcp: Reset STALLED flag on ACK only, check for pending socket data
tcp: Force …
[View More]TCP_WINDOW_CLAMP before resetting STALLED flag
tcp, tap: Don't increase tap-side sequence counter for dropped frames
passt.1: Add note about tuning rmem_max and wmem_max for throughput
passt.1 | 33 +++++++++++++++++++++++++
tap.c | 10 +++++---
tap.h | 2 +-
tcp.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++----------
4 files changed, 102 insertions(+), 17 deletions(-)
--
2.39.2
[View Less]
cppcheck 2.12 (which Fedora 38 has updated, for one) introduces a
number of new warnings. Unfortunately, at least one of these is a
clear bug in cppcheck.
This series fixes a number of the new warnings reported in passt
(patches 1..3) and works around the remaining cppcheck bug (patch 4).
I'm pretty confident that patches 1 & 2 are safe and beneficial to
apply regardless of which cppcheck we're using.
Patch 3 is a little more dubious, because it potentially increases the
cppcheck runtime.…
[View More] On my system it doesn't seem to make a significant
difference, but that might not always stay true.
Patch 4 is a tricky one. It applies a specific suppression to work
around the cppcheck bug. That's necessary to get a pass with the
currently available cppcheck. However, it's ugly and we'd like to
remove it once the bug is fixed, but have no obvious way to remind us
to do that. What we want to do here kind of depends how long it takes
the bug to be fixed, which isn't clear at the moment.
David Gibson (4):
cppcheck: Make many pointers const
conf: Remove overly cryptic selection of forward table
cppcheck: Use "exhaustive" level checking when available
cppcheck: Work around bug in cppcheck 2.12.0
Makefile | 6 ++++++
conf.c | 25 +++++++++++--------------
isolation.c | 2 +-
isolation.h | 2 +-
log.c | 6 +++---
netlink.c | 6 +++---
netlink.h | 6 +++---
pasta.c | 4 ++--
pasta.h | 2 +-
tap.c | 14 +++++++-------
tap.h | 6 +++---
tcp.c | 25 ++++++++++++++++---------
tcp_conn.h | 2 +-
tcp_splice.c | 5 +++--
tcp_splice.h | 3 ++-
udp.c | 4 ++--
udp.h | 2 +-
util.c | 5 +++--
util.h | 4 ++--
19 files changed, 71 insertions(+), 58 deletions(-)
--
2.41.0
[View Less]
The fundamental patch here is 2/3, which is a workaround for a rather
surprising kernel behaviour we seem to be hitting. This all comes from
the investigation around https://bugs.passt.top/show_bug.cgi?id=74.
I can't hit stalls anymore and throughput looks finally good to me
(~3.5gbps with 208 KiB rmem_max and wmem_max), but... please test
(again).
v2:
- Drop 1/5 (checking for ACK before resetting STALLED and calling
tcp_data_from_sock() directly from tcp_tap_handler())
- Moving reset of …
[View More]STALLED flag is now done in 2/3 (was 3/5)
- Don't pass unnecessary argument to tcp_data_to_tap() in 3/3 (was
4/5)
- Drop 5/5 as long as we're not sure what kind of buffer clamping
is actually beneficial
Stefano Brivio (3):
tcp: Fix comment to tcp_sock_consume()
tcp: Force TCP_WINDOW_CLAMP before resetting STALLED flag
tcp, tap: Don't increase tap-side sequence counter for dropped frames
tap.c | 10 ++++++---
tap.h | 2 +-
tcp.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++----------
3 files changed, 67 insertions(+), 16 deletions(-)
--
2.39.2
[View Less]
While working on unifying the hashing for the flow table, I noticed
some awkwardness in the siphash functions. While looking into that I
noticed some bugs. So.. here we are.
Changes since v1:
* Don't accidentally increase the alignment of union inany_addr
David Gibson (10):
siphash: Make siphash functions consistently return 64-bit results
siphash: Make sip round calculations an inline function rather than
macro
siphash: Add siphash_feed() helper
siphash: Clean up hash …
[View More]finalisation with posthash_final() function
siphash: Fix bug in state initialisation
siphash: Use more hygienic state initialiser
siphash: Use specific structure for internal state
siphash: Make internal helpers public
siphash, checksum: Move TBAA explanation to checksum.c
siphash: Use incremental rather than all-at-once siphash functions
Makefile | 2 +-
checksum.c | 19 ++--
inany.h | 17 +++-
siphash.c | 243 ---------------------------------------------------
siphash.h | 119 +++++++++++++++++++++++--
tcp.c | 37 +++-----
tcp_splice.c | 1 +
7 files changed, 159 insertions(+), 279 deletions(-)
delete mode 100644 siphash.c
--
2.41.0
[View Less]
While working on unifying the hashing for the flow table, I noticed
some awkwardness in the siphash functions. While looking into that I
noticed some bugs. So.. here we are.
David Gibson (10):
siphash: Make siphash functions consistently return 64-bit results
siphash: Make sip round calculations an inline function rather than
macro
siphash: Add siphash_feed() helper
siphash: Clean up hash finalisation with posthash_final() function
siphash: Fix bug in state initialisation
…
[View More]siphash: Use more hygienic state initialiser
siphash: Use specific structure for internal state
siphash: Make internal helpers public
siphash, checksum: Move TBAA explanation to checksum.c
siphash: Use incremental rather than all-at-once siphash functions
Makefile | 2 +-
checksum.c | 19 ++--
inany.h | 16 +++-
siphash.c | 243 ---------------------------------------------------
siphash.h | 119 +++++++++++++++++++++++--
tcp.c | 37 +++-----
tcp_splice.c | 1 +
7 files changed, 158 insertions(+), 279 deletions(-)
delete mode 100644 siphash.c
--
2.41.0
[View Less]
We already had a couple of places we were working around clang-tidy
issue 58992, and the flow table series adds more. I got sick of ugly
inlines every time we used a syscall which returns a socket address,
so wrote a patch to consolidate the workarounds in one place.
However, that patch added an include of <string.h> to util.h which
exposed a classic C library gotcha in packet.c, so I fixed that too.
Changes since v1:
* Updated missed comment to match code changes in 1/2
* Fixed more …
[View More]places which shadowed index(3)
David Gibson (2):
Avoid shadowing index(3)
util: Consolidate and improve workarounds for clang-tidy issue 58992
Makefile | 2 +-
icmp.c | 5 -----
packet.c | 30 +++++++++++++++---------------
packet.h | 10 +++++-----
tcp.c | 22 ++++++++--------------
tcp_splice.c | 2 +-
util.c | 12 ++++++------
util.h | 43 ++++++++++++++++++++++++++++++++++++++++++-
8 files changed, 78 insertions(+), 48 deletions(-)
--
2.41.0
[View Less]
The reporter is running a SMTP server behind pasta, and the client
waits for the server's banner before sending any data. In turn, the
server waits for our ACK after sending SYN,ACK, which never comes.
If we use the ACK_IF_NEEDED indication to tcp_send_flag(), given that
there's no pending data, we delay sending the ACK segment at the end
of the three-way handshake until we have some data to send to the
server.
This was actually intended, as I thought we would lower the latency
for new …
[View More]connections, but we can't assume that the client will start
sending data first (SMTP is the typical example where this doesn't
happen).
And, trying out this patch with SSH (where the client starts sending
data first), the reporter actually noticed we have a lower latency
by forcing an ACK right away. Comparing a capture before the patch:
13:07:14.007704 IP 10.1.2.1.42056 > 10.1.2.140.1234: Flags [S], seq 1797034836, win 65535, options [mss 4096,nop,wscale 7], length 0
13:07:14.007769 IP 10.1.2.140.1234 > 10.1.2.1.42056: Flags [S.], seq 2297052481, ack 1797034837, win 65480, options [mss 65480,nop,wscale 7], length 0
13:07:14.008462 IP 10.1.2.1.42056 > 10.1.2.140.1234: Flags [.], seq 1:22, ack 1, win 65535, length 21
13:07:14.008496 IP 10.1.2.140.1234 > 10.1.2.1.42056: Flags [.], ack 22, win 512, length 0
13:07:14.011799 IP 10.1.2.140.1234 > 10.1.2.1.42056: Flags [P.], seq 1:515, ack 22, win 512, length 514
and after:
13:10:26.165364 IP 10.1.2.1.59508 > 10.1.2.140.1234: Flags [S], seq 4165939595, win 65535, options [mss 4096,nop,wscale 7], length 0
13:10:26.165391 IP 10.1.2.140.1234 > 10.1.2.1.59508: Flags [S.], seq 985607380, ack 4165939596, win 65480, options [mss 65480,nop,wscale 7], length 0
13:10:26.165418 IP 10.1.2.1.59508 > 10.1.2.140.1234: Flags [.], ack 1, win 512, length 0
13:10:26.165683 IP 10.1.2.1.59508 > 10.1.2.140.1234: Flags [.], seq 1:22, ack 1, win 512, length 21
13:10:26.165698 IP 10.1.2.140.1234 > 10.1.2.1.59508: Flags [.], ack 22, win 512, length 0
13:10:26.167107 IP 10.1.2.140.1234 > 10.1.2.1.59508: Flags [P.], seq 1:515, ack 22, win 512, length 514
the latency between the initial SYN segment and the first data
transmission actually decreases from 792µs to 334µs. This is not
statistically relevant as we have a single measurement, but it can't
be that bad, either.
Reported-by: cr3bs (from IRC)
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
tcp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tcp.c b/tcp.c
index 76b7b8d..39844eb 100644
--- a/tcp.c
+++ b/tcp.c
@@ -2562,7 +2562,7 @@ static void tcp_conn_from_sock_finish(struct ctx *c, struct tcp_tap_conn *conn,
* dequeue waiting for SYN,ACK from tap -- check now.
*/
tcp_data_from_sock(c, conn);
- tcp_send_flag(c, conn, ACK_IF_NEEDED);
+ tcp_send_flag(c, conn, ACK);
}
/**
--
2.39.2
[View Less]
If we go over the flattened list of search domains and just replace
dots and zero bytes with the length of the next label to implement
the encoding specified by section 3.1 of RFC 1035, if there are
multiple domains in the search list, we'll also replace separators
between two domain names with the length of the first label of the
second domain, plus one. Those should remain as zero bytes to
separate domains, though.
To distinguish between label separators and domain names separators,
for …
[View More]simplicity, introduce a dot before the first label of every
domain we copy to form the list. All dots are then replaced by label
lengths, and separators (zero bytes) remain as they are.
As we do this, we need to make sure we don't replace the trailing
dot, if present: that's already a separator. Skip copying it, and
just add separators as needed.
Now that we don't copy those, though, we might end up with
zero-length domains: skip them, as they're meaningless anyway.
And as we might skip domains, we can't use the index 'i' to check if
we're at the beginning of the option -- use 'srch' instead.
This is very similar to how we prepare the list for NDP option 31,
except that we don't need padding (RFC 8106, 5.2) here, and we should
refactor this into common functions, but it probably makes sense to
rework the NDP responder (https://bugs.passt.top/show_bug.cgi?id=21)
first.
Reported-by: Sebastian Mitterle <smitterl(a)redhat.com>
Link: https://bugs.passt.top/show_bug.cgi?id=75
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
dhcpv6.c | 24 +++++++++++++++++-------
1 file changed, 17 insertions(+), 7 deletions(-)
diff --git a/dhcpv6.c b/dhcpv6.c
index fc42a84..58171bb 100644
--- a/dhcpv6.c
+++ b/dhcpv6.c
@@ -376,24 +376,34 @@ search:
return offset;
for (i = 0; *c->dns_search[i].n; i++) {
- if (!i) {
+ size_t name_len = strlen(c->dns_search[i].n);
+
+ /* We already append separators, don't duplicate if present */
+ if (c->dns_search[i].n[name_len - 1] == '.')
+ name_len--;
+
+ /* Skip root-only search domains */
+ if (!name_len)
+ continue;
+
+ if (!srch) {
srch = (struct opt_dns_search *)(buf + offset);
offset += sizeof(struct opt_hdr);
srch->hdr.t = OPT_DNS_SEARCH;
srch->hdr.l = 0;
p = srch->list;
- *p = 0;
}
- p = stpcpy(p + 1, c->dns_search[i].n);
- *(p++) = 0;
- srch->hdr.l += strlen(c->dns_search[i].n) + 2;
- offset += strlen(c->dns_search[i].n) + 2;
+ *p = '.';
+ p = stpncpy(p + 1, c->dns_search[i].n, name_len);
+ p++;
+ srch->hdr.l += name_len + 2;
+ offset += name_len + 2;
}
if (srch) {
for (i = 0; i < srch->hdr.l; i++) {
- if (srch->list[i] == '.' || !srch->list[i]) {
+ if (srch->list[i] == '.') {
srch->list[i] = strcspn(srch->list + i + 1,
".");
}
--
2.39.2
[View Less]
It turns out we never used 'clen' until commit 1f24d3efb499 ("dhcp:
support BOOTP clients"), and we always ignored option 55 (Parameter
Request List), while, according to RFC 2132, we MUST try to insert
the requested options in the order requested by the client.
The commit mentioned above made this visible because now every client
is reported as sending a DHCPREQUEST as an old BOOTP client, based on
the lack of option 53 (that is, zero length).
Fixes: b439984641ed ("merd: ARP and DHCP …
[View More]handlers, connection tracking fixes")
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
dhcp.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/dhcp.c b/dhcp.c
index c1ac95e..46c258e 100644
--- a/dhcp.c
+++ b/dhcp.c
@@ -317,6 +317,7 @@ int dhcp(const struct ctx *c, const struct pool *p)
return -1;
memcpy(&opts[*type].c, val, *olen);
+ opts[*type].clen = *olen;
opt_off += *olen + 2;
}
--
2.39.2
[View Less]