This version contains what I perceive as the least controversial parts of
my previous RFC series. It basically makes address handling behave as
before, but now allows multiple addresses on both the host side and the
guest side.

v2:
 - Added the earlier standalone CIDR commit to the head of the series.
 - Replaced the guest namespace interface subscriptions with just an
   address observation feature, so that it works with both PASTA and PASST.
 - Unified 'no_copy_addrs' and 'copy_addrs' code paths, as suggested by
   David G.
 - Multiple other changes, also based on feedback from David.
 - Removed the host interface subscription patches, for now. I intend to
   re-add them once this series is applied.
 - Outstanding question: When do we add an IPv4 link local address to the
   guest? Only in local/opaque mode? Only when explicitly requested? Always?

v3:
 - Unified the IPv4 and IPv6 arrays into one array
 - Changed prefix_len to always be in IPv6/IPv4-mapped format
 - Updated migration protocol to v3, handling multiple addresses
 - Many other smaller changes, based on feedback from the PASST team

v4:
 - Numerous changes based on feedback
 - Added several new commits, mostly broken out of the pre-existing ones.
Jon Maloy (12):
  ip: Introduce unified multi-address data structures
  ip: Introduce for_each_addr() macro for address iteration
  fwd: Unify guest accessibility checks with unified address array
  arp: Check all configured addresses in ARP filtering
  pasta: Extract pasta_ns_conf_ip4/6() to reduce nesting
  netlink: Return prefix length for IPv6 addresses in nl_addr_get()
  conf: Allow multiple -a/--address options per address family
  ip: Track observed guest IPv4 addresses in unified address array
  ip: Track observed guest IPv6 addresses in unified address array
  fwd: Unify fwd_set_observed_ip4() and fwd_set_observed_ip6()
  migrate: Rename v1 address functions to v2 for clarity
  migrate: Update protocol to v3 for multi-address support

 arp.c     |  15 +++-
 conf.c    | 147 +++++++++++++++++++++---------------
 conf.h    |   8 ++
 dhcp.c    |  13 +++-
 dhcpv6.c  |  11 ++-
 dhcpv6.h  |   2 +-
 fwd.c     | 221 ++++++++++++++++++++++++++++++++++++++----------------
 fwd.h     |   3 +
 inany.h   |   3 +
 ip.h      |   5 ++
 migrate.c | 185 ++++++++++++++++++++++++++++++++++++++++-----
 ndp.c     |  17 ++++-
 netlink.c |   4 +-
 passt.h   | 104 +++++++++++++++++++++----
 pasta.c   | 200 ++++++++++++++++++++++++++++--------------------
 tap.c     |  72 +++++++++++++-----
 16 files changed, 736 insertions(+), 274 deletions(-)

--
2.52.0
As preparation for supporting multiple addresses per interface, we
replace the single addr/prefix_len fields with an array. The
array consists of a new struct inany_addr_entry containing an
address and prefix length, both in inany_addr format.
Despite some necessary code refactoring, there are only two
functional changes:
- The indicated IPv6 prefix length is now properly stored, instead
of being ignored and overridden with the hardcoded value 64, as
has been the case until now.
- Since even IPv4 addresses now are stored in IPv6 format, we
also store the corresponding prefix length in that format,
i.e. using the range [96,128] instead of [0,32].
Signed-off-by: Jon Maloy
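
To illustrate the second functional change in the commit message above, here is a minimal sketch of an entry that stores an IPv4 address in IPv4-mapped form and keeps its prefix length in IPv6 terms. The names sketch_inany, sketch_entry and the helper functions are hypothetical stand-ins and are not taken from the series; only the +96 offset and the [96,128] range come from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical stand-in for union inany_addr: an IPv6 address that may
 * carry an IPv4 address in IPv4-mapped form (::ffff:a.b.c.d).
 */
union sketch_inany {
	struct in6_addr a6;
	uint8_t bytes[16];
};

/* Hypothetical stand-in for struct inany_addr_entry */
struct sketch_entry {
	union sketch_inany addr;
	uint8_t prefix_len;	/* always in IPv6 terms: 96..128 for IPv4 */
	uint8_t flags;
};

static const uint8_t v4_mapped_prefix[12] = {
	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
};

/* Nonzero if the address is IPv4-mapped */
static int sketch_is_v4(const union sketch_inany *a)
{
	return !memcmp(a->bytes, v4_mapped_prefix, sizeof(v4_mapped_prefix));
}

/* Store an IPv4 address and its native prefix length (0..32) */
static void sketch_entry_from_v4(struct sketch_entry *e, struct in_addr a4,
				 uint8_t plen4)
{
	assert(plen4 <= 32);

	memcpy(e->addr.bytes, v4_mapped_prefix, sizeof(v4_mapped_prefix));
	memcpy(e->addr.bytes + 12, &a4.s_addr, sizeof(a4.s_addr));
	e->prefix_len = plen4 + 96;	/* shift into the [96,128] range */
	e->flags = 0;
}

/* Prefix length as the entry's own address family would express it */
static uint8_t sketch_native_prefix_len(const struct sketch_entry *e)
{
	return sketch_is_v4(&e->addr) ? e->prefix_len - 96 : e->prefix_len;
}
```

With this layout, a /24 IPv4 network is stored with prefix length 120, and code that needs the IPv4 view (for example when answering DHCP) subtracts 96 again.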
Add the for_each_addr() macro to iterate over addresses in the unified
array. The macro supports an address family filter parameter (AF_INET,
AF_INET6, or 0 for all) using a _next_addr_idx() helper function to
skip non-matching entries.
Signed-off-by: Jon Maloy
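
As a rough illustration of the iteration scheme described above, the following sketch shows one way a filtering iterator macro can be built on top of a "next matching index" helper. All sketch_* names, the table size and the struct layout are assumptions made for this example; the actual for_each_addr() and _next_addr_idx() in the series may look different.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define SKETCH_MAX_ADDRS 8	/* hypothetical table size */

struct sketch_entry {
	struct in6_addr addr;	/* IPv4 stored as IPv4-mapped */
	uint8_t prefix_len;
	uint8_t flags;
};

struct sketch_ctx {
	struct sketch_entry addrs[SKETCH_MAX_ADDRS];
	int addr_count;
};

static int sketch_entry_af(const struct sketch_entry *e)
{
	return IN6_IS_ADDR_V4MAPPED(&e->addr) ? AF_INET : AF_INET6;
}

/* Index of the first entry at or after @i matching @af (0 matches all),
 * or addr_count if there is none.
 */
static int sketch_next_addr_idx(const struct sketch_ctx *c, int i, int af)
{
	while (i < c->addr_count && af && sketch_entry_af(&c->addrs[i]) != af)
		i++;
	return i;
}

#define sketch_for_each_addr(e, c, af)					\
	for (int i_ = sketch_next_addr_idx((c), 0, (af));		\
	     i_ < (c)->addr_count && ((e) = &(c)->addrs[i_], 1);	\
	     i_ = sketch_next_addr_idx((c), i_ + 1, (af)))

/* Append an address given in IPv6 text form; helper for the example only */
static int sketch_add(struct sketch_ctx *c, const char *text)
{
	if (c->addr_count >= SKETCH_MAX_ADDRS)
		return -1;
	if (inet_pton(AF_INET6, text, &c->addrs[c->addr_count].addr) != 1)
		return -1;
	c->addr_count++;
	return 0;
}

/* Count the entries of a given family using the macro */
static int sketch_count_addrs(const struct sketch_ctx *c, int af)
{
	const struct sketch_entry *e;
	int n = 0;

	sketch_for_each_addr(e, c, af) {
		(void)e;
		n++;
	}
	return n;
}
```

Keeping the skip logic in the helper means loop bodies never see a non-matching entry, so callers do not need their own family checks.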
We replace the fwd_guest_accessible4() and fwd_guest_accessible6()
functions with a unified fwd_guest_accessible() function that handles
both address families. With the unified address array, we can check
all configured addresses in a single pass using for_each_addr() with
family filter AF_UNSPEC (== 0).
Signed-off-by: Jon Maloy
As a preparation for handling multiple addresses, we update ignore_arp()
to check against all addresses in the unified addrs[] array using the
for_each_addr() macro.
Signed-off-by: Jon Maloy
Extract the IPv4 and IPv6 namespace configuration code from
pasta_ns_conf() into separate static functions. This reduces
indentation depth and prepares for adding multi-address support.
No functional change.
Signed-off-by: Jon Maloy
nl_addr_get() was not setting the prefix_len output parameter for
IPv6 addresses, only for IPv4. This meant callers always got 0 for
IPv6, forcing them to use a hardcoded default (64).
Fix by assigning *prefix_len in the IPv6 case, matching the IPv4
behavior.
Signed-off-by: Jon Maloy
Allow specifying multiple addresses per family with -a/--address.
The first address of each family is used for DHCP/DHCPv6 assignment.
Signed-off-by: Jon Maloy
Create a common fwd_set_observed() function that handles both IPv4 and
IPv6 observed addresses. The function determines the address family from
the input and uses the appropriate reserved position (0 for IPv4, 1 for
IPv6) for O(1) lookup.
Call sites are updated to use fwd_set_observed() directly with
union inany_addr.
This reduces code duplication and ensures consistent behavior between
IPv4 and IPv6 address tracking.
Signed-off-by: Jon Maloy
We remove the addr_seen field in struct ip4_ctx and replace it by
setting a new CONF_ADDR_OBSERVED flag in the corresponding entry
in the unified address array.
The observed IPv4 address is always put at position 0 in the array,
allowing for very fast lookup. Only one IPv4 address can have the
OBSERVED flag at a time.
Signed-off-by: Jon Maloy
Some migration address structures and functions have a _v1 suffix.
This is confusing, since they currently handle version 2 of
the migration protocol. We are about to introduce version 3 of
the protocol, so we give these functions the correct suffix, _v2.
This matches current reality, and makes for a clearer distinction
between the old and the new versions of those functions.
Signed-off-by: Jon Maloy
We remove the addr_seen and addr_ll_seen fields in struct ip6_ctx
and replace them by setting CONF_ADDR_OBSERVED and CONF_ADDR_LINKLOCAL
flags in the corresponding entry in the unified address array.
The observed IPv6 address is always kept at position 1 (position 0
is reserved for IPv4), allowing very fast lookup. Only one IPv6 address
can have the OBSERVED flag at a time.
A new fwd_set_observed_ip6() function handles observed IPv6 addresses,
mirroring the IPv4 fwd_set_observed_ip4() function. Both tap.c and
migrate.c now use this common function.
Signed-off-by: Jon Maloy
We update the migration protocol to version 3 to support distributing
multiple addresses from the unified address array. The new protocol
migrates all address entries in the array, along with their prefix
lengths and flags, and leaves it to the receiver to filter which
ones it wants to apply.
Signed-off-by: Jon Maloy
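
The "sender transfers everything, receiver filters" idea above could look roughly like the sketch below. The flag values mirror the CONF_ADDR_* bits shown in the conf.h hunk later in this thread; the sk_wire_entry layout, the function name and the accept-mask policy are assumptions for illustration only and do not describe the actual v3 wire format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit positions as CONF_ADDR_* in conf.h; names are local to the sketch */
#define SK_ADDR_USER		(1u << 0)
#define SK_ADDR_HOST		(1u << 1)
#define SK_ADDR_LINKLOCAL	(1u << 2)
#define SK_ADDR_OBSERVED	(1u << 3)

/* Hypothetical per-address record as received from the migration source */
struct sk_wire_entry {
	uint8_t addr[16];
	uint8_t prefix_len;
	uint8_t flags;
};

/* Copy to @out the entries of @in whose flags intersect @accept.
 * Return the number of entries kept, at most @n_out.
 */
static size_t sk_filter_entries(const struct sk_wire_entry *in, size_t n_in,
				struct sk_wire_entry *out, size_t n_out,
				uint8_t accept)
{
	size_t i, kept = 0;

	for (i = 0; i < n_in && kept < n_out; i++) {
		if (in[i].flags & accept)
			out[kept++] = in[i];
	}

	return kept;
}
```

A target that only cares about what the guest was actually using could pass SK_ADDR_OBSERVED as the mask, while one that wants the full configuration could pass all bits.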
On 2026-02-17 17:18, Jon Maloy wrote:
We remove the addr_seen field in struct ip4_ctx and replace it by setting a new CONF_ADDR_OBSERVED flag in the corresponding entry in the unified address array.
The observed IPv4 address is always put at position 0 in the array, allowing for very fast lookup. Only one IPv4 address can have the OBSERVED flag at a time.
Signed-off-by: Jon Maloy
---
v4: - Removed migration protocol update, to be added in later commit
    - Allow only one OBSERVED address at a time
    - Some other changes based on feedback from David G
---
 conf.c    |   2 -
 conf.h    |   1 +
 fwd.c     | 114 ++++++++++++++++++++++++++++++++++++++++++++++++------
 fwd.h     |   3 ++
 migrate.c |  15 ++++++-
 passt.h   |   2 -
 tap.c     |  15 ++++++-
 7 files changed, 133 insertions(+), 19 deletions(-)

diff --git a/conf.c b/conf.c
index ca0a764..0172dcd 100644
--- a/conf.c
+++ b/conf.c
@@ -738,7 +738,6 @@ static unsigned int conf_ip4(struct ctx *c, unsigned int ifi)
 		e->addr = inany_from_v4(addr);
 		e->prefix_len = prefix_len + 96;
 		e->flags = CONF_ADDR_HOST;
-		ip4->addr_seen = addr;
 	}
 
 	ip4->our_tap_addr = ip4->guest_gw;
@@ -755,7 +754,6 @@ static void conf_ip4_local(struct ctx *c)
 	struct inany_addr_entry *e = &c->addrs[c->addr_count++];
 	struct ip4_ctx *ip4 = &c->ip4;
 
-	ip4->addr_seen = IP4_LL_GUEST_ADDR;
 	ip4->our_tap_addr = ip4->guest_gw = IP4_LL_GUEST_GW;
 	ip4->no_copy_addrs = ip4->no_copy_routes = true;
 	e->addr = inany_from_v4(IP4_LL_GUEST_ADDR);
diff --git a/conf.h b/conf.h
index bfad36f..8b10ac6 100644
--- a/conf.h
+++ b/conf.h
@@ -12,6 +12,7 @@
 #define CONF_ADDR_USER		BIT(0)	/* User set via -a */
 #define CONF_ADDR_HOST		BIT(1)	/* From host interface */
 #define CONF_ADDR_LINKLOCAL	BIT(2)	/* Link-local address */
+#define CONF_ADDR_OBSERVED	BIT(3)	/* Seen in guest traffic */
 
 enum passt_modes conf_mode(int argc, char *argv[]);
 void conf(struct ctx *c, int argc, char **argv);
diff --git a/fwd.c b/fwd.c
index fa5d667..ca704c2 100644
--- a/fwd.c
+++ b/fwd.c
@@ -24,6 +24,7 @@
 #include "ip.h"
 #include "fwd.h"
 #include "passt.h"
+#include "conf.h"
 #include "lineread.h"
 #include "flow_table.h"
 #include "netlink.h"
@@ -491,6 +492,85 @@ static bool is_dns_flow(uint8_t proto, const struct flowside *ini)
 	       ((ini->oport == 53) || (ini->oport == 853));
 }
 
+/**
+ * fwd_guest_addr() - Get guest address matching criteria
+ * @c:		Execution context
+ * @af:		Address family (AF_INET, AF_INET6, or 0 for any)
+ * @incl:	Flags that must be present (any-match)
+ * @excl:	Flags that must not be present
+ *
+ * Return: first address matching criteria, or NULL
+ */
+const union inany_addr *fwd_guest_addr(const struct ctx *c, sa_family_t af,
+				       uint8_t incl, uint8_t excl)
+{
+	const struct inany_addr_entry *e;
+
+	for_each_addr(e, c, af) {
+		if (incl && !(e->flags & incl))
+			continue;
+		if (e->flags & excl)
+			continue;
+		return &e->addr;
+	}
+
+	return NULL;
+}
+
+/**
+ * fwd_set_observed_ip4() - Set observed IPv4 guest address
+ * @c:		Execution context
+ * @addr:	IPv4 address observed in guest traffic
+ *
+ * Mark @addr as the observed guest address. The observed address is always
+ * kept at position 0 for O(1) lookup. Only one address can have the OBSERVED
+ * flag at a time.
+ */
+void fwd_set_observed_ip4(struct ctx *c, const struct in_addr *addr)
+{
+	struct inany_addr_entry *e = &c->addrs[0];
+	int i;
+
+	if (!addr->s_addr)
+		return;
+
+	/* Fast path: check if already observed at position 0 */
+	if (c->addr_count > 0 && (e->flags & CONF_ADDR_OBSERVED) &&
+	    inany_equals4(&e->addr, addr))
+		return;
+
+	/* Slow path: new observed address - insert at position 0 */
+	if (c->addr_count >= INANY_MAX_ADDRS) {
+		debug("Address table full, can't add observed IPv4");
+		return;
+	}
+
+	/* Make room and insert at position 0 */
+	memmove(&c->addrs[1], e, c->addr_count * sizeof(*e));
+	c->addr_count++;
+	inany_from_af(&e->addr, AF_INET, addr);
+	e->prefix_len = 0;
+	e->flags = CONF_ADDR_OBSERVED;
+
+	/* Handle old observed IPv4 address, if any */
+	for (i = 1; i < c->addr_count; i++) {
+		e = &c->addrs[i];
+
+		if (!inany_v4(&e->addr) || !(e->flags & CONF_ADDR_OBSERVED))
+			continue;
+
+		e->flags &= ~CONF_ADDR_OBSERVED;
+
+		/* Remove if no other flags, or if addr is duplicate */
+		if (!e->flags || inany_equals4(&e->addr, addr)) {
+			memmove(&c->addrs[i], &c->addrs[i + 1],
+				(c->addr_count - i - 1) * sizeof(*e));
+			c->addr_count--;
+		}
+		break;
+	}
+}
I did this on David's suggestion, but thinking more about it I find it is not optimal. There is no reason to limit the number of OBSERVED addresses to just one per protocol. If a guest happens to switch frequently between addresses, this will come with a measurable performance penalty, because we will always have to add the current address at the head of the array and potentially remove another one.

Alternative approach: We still add all new OBSERVED addresses to the head of the array, but leave the others as they are. This means that all OBSERVED addresses will accumulate at the beginning, and all lookups will be very quick. Statistically, it is overwhelmingly likely that there will be a hit already at position 0 or 1, and we really have lost nothing in comparison to the one-OBSERVED-address policy.

/jon
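
One possible reading of the alternative approach described above is sketched here: a newly seen address is inserted at the head of the table, while an address that is already present is only flagged and left where it is, so nothing is ever removed. The sk_* names, the table size and the entry layout are hypothetical and chosen for the example; this is not code from the series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SK_MAX_ADDRS		4		/* hypothetical table size */
#define SK_ADDR_OBSERVED	(1u << 3)	/* same bit as CONF_ADDR_OBSERVED */

struct sk_entry {
	uint8_t addr[16];
	uint8_t prefix_len;
	uint8_t flags;
};

struct sk_ctx {
	struct sk_entry addrs[SK_MAX_ADDRS];
	int addr_count;
};

/* Record @addr as observed. Return the index where it now lives,
 * or -1 if it is new and the table is full.
 */
static int sk_observe(struct sk_ctx *c, const uint8_t addr[16])
{
	int i;

	/* Already known: just flag it, don't move or remove anything */
	for (i = 0; i < c->addr_count; i++) {
		if (!memcmp(c->addrs[i].addr, addr, 16)) {
			c->addrs[i].flags |= SK_ADDR_OBSERVED;
			return i;
		}
	}

	if (c->addr_count >= SK_MAX_ADDRS)
		return -1;

	/* New address: shift everything down and insert at the head */
	memmove(&c->addrs[1], &c->addrs[0],
		c->addr_count * sizeof(c->addrs[0]));
	c->addr_count++;
	memcpy(c->addrs[0].addr, addr, 16);
	c->addrs[0].prefix_len = 0;
	c->addrs[0].flags = SK_ADDR_OBSERVED;

	return 0;
}
```

In this variant a guest that alternates between two addresses pays for one insertion per address and afterwards only for a short scan, at the cost of the table filling up if the guest cycles through many distinct addresses.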
+
 /**
  * fwd_guest_accessible() - Is address guest-accessible
  * @c:		Execution context
@@ -515,17 +595,11 @@ static bool fwd_guest_accessible(const struct ctx *c,
 	if (inany_is_unspecified4(addr))
 		return false;
 
-	/* Check against all configured guest addresses */
+	/* Check against all configured and observed guest addresses */
 	for_each_addr(e, c, 0)
 		if (inany_equals(addr, &e->addr))
 			return false;
 
-	/* Also check addr_seen: it tracks the address the guest is actually
-	 * using, which may differ from configured addresses.
-	 */
-	if (inany_equals4(addr, &c->ip4.addr_seen))
-		return false;
-
 	/* For IPv6, addr_seen starts unspecified, because we don't know what LL
 	 * address the guest will take until we see it. Only check against it
 	 * if it has been set to a real address.
@@ -726,10 +800,23 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
 		 * match.
 		 */
 		if (inany_v4(&ini->eaddr)) {
-			if (c->host_lo_to_ns_lo)
+			if (c->host_lo_to_ns_lo) {
 				tgt->eaddr = inany_loopback4;
-			else
-				tgt->eaddr = inany_from_v4(c->ip4.addr_seen);
+			} else {
+				const union inany_addr *guest_addr;
+
+				guest_addr = fwd_guest_addr(c, AF_INET,
+							    CONF_ADDR_OBSERVED,
+							    0);
+				if (!guest_addr)
+					guest_addr = fwd_guest_addr(c, AF_INET,
+							    CONF_ADDR_USER | CONF_ADDR_HOST,
+							    0);
+				if (!guest_addr)
+					return PIF_NONE;
+
+				tgt->eaddr = *guest_addr;
+			}
 			tgt->oaddr = inany_any4;
 		} else {
 			if (c->host_lo_to_ns_lo)
@@ -761,7 +848,12 @@ uint8_t fwd_nat_from_host(const struct ctx *c, uint8_t proto,
 	tgt->oport = ini->eport;
 
 	if (inany_v4(&tgt->oaddr)) {
-		tgt->eaddr = inany_from_v4(c->ip4.addr_seen);
+		const union inany_addr *guest_addr;
+
+		guest_addr = fwd_guest_addr(c, AF_INET, CONF_ADDR_OBSERVED, 0);
+		if (!guest_addr)
+			return PIF_NONE;
+		tgt->eaddr = *guest_addr;
 	} else {
 		if (inany_is_linklocal6(&tgt->oaddr))
 			tgt->eaddr.a6 = c->ip6.addr_ll_seen;
diff --git a/fwd.h b/fwd.h
index 7792582..38f4e60 100644
--- a/fwd.h
+++ b/fwd.h
@@ -15,6 +15,9 @@ struct flowside;
 
 void fwd_probe_ephemeral(void);
 bool fwd_port_is_ephemeral(in_port_t port);
+const union inany_addr *fwd_guest_addr(const struct ctx *c, sa_family_t af,
+				       uint8_t incl, uint8_t excl);
+void fwd_set_observed_ip4(struct ctx *c, const struct in_addr *addr);
 
 enum fwd_ports_mode {
 	FWD_UNSET = 0,
diff --git a/migrate.c b/migrate.c
index 48d63a0..d223857 100644
--- a/migrate.c
+++ b/migrate.c
@@ -18,6 +18,8 @@
 #include "util.h"
 #include "ip.h"
 #include "passt.h"
+#include "conf.h"
+#include "fwd.h"
 #include "inany.h"
 #include "flow.h"
 #include "flow_table.h"
@@ -57,11 +59,15 @@ static int seen_addrs_source_v1(struct ctx *c,
 	struct migrate_seen_addrs_v1 addrs = {
 		.addr6 = c->ip6.addr_seen,
 		.addr6_ll = c->ip6.addr_ll_seen,
-		.addr4 = c->ip4.addr_seen,
 	};
+	const union inany_addr *obs4;
 
 	(void)stage;
 
+	obs4 = fwd_guest_addr(c, AF_INET, CONF_ADDR_OBSERVED, 0);
+	if (obs4)
+		addrs.addr4 = *inany_v4(obs4);
+
 	memcpy(addrs.mac, c->guest_mac, sizeof(addrs.mac));
 
 	if (write_all_buf(fd, &addrs, sizeof(addrs)))
@@ -82,6 +88,7 @@ static int seen_addrs_target_v1(struct ctx *c,
 				const struct migrate_stage *stage, int fd)
 {
 	struct migrate_seen_addrs_v1 addrs;
+	struct in_addr addr4;
 
 	(void)stage;
 
@@ -90,7 +97,11 @@ static int seen_addrs_target_v1(struct ctx *c,
 
 	c->ip6.addr_seen = addrs.addr6;
 	c->ip6.addr_ll_seen = addrs.addr6_ll;
-	c->ip4.addr_seen = addrs.addr4;
+
+	/* Copy to avoid unaligned access from packed struct */
+	addr4 = addrs.addr4;
+	fwd_set_observed_ip4(c, &addr4);
+
 	memcpy(c->guest_mac, addrs.mac, sizeof(c->guest_mac));
 
 	return 0;
diff --git a/passt.h b/passt.h
index 15d6596..fa747c6 100644
--- a/passt.h
+++ b/passt.h
@@ -78,7 +78,6 @@ struct inany_addr_entry {
 
 /**
  * struct ip4_ctx - IPv4 execution context
- * @addr_seen:		Latest IPv4 address seen as source from tap
  * @guest_gw:		IPv4 gateway as seen by the guest
  * @map_host_loopback:	Outbound connections to this address are NATted to the
  *			host's 127.0.0.1
@@ -94,7 +93,6 @@ struct inany_addr_entry {
  * @no_copy_addrs:	Don't copy all addresses when configuring namespace
  */
 struct ip4_ctx {
-	struct in_addr addr_seen;
 	struct in_addr guest_gw;
 	struct in_addr map_host_loopback;
 	struct in_addr map_guest_addr;
diff --git a/tap.c b/tap.c
index 4298dcd..8c1ed35 100644
--- a/tap.c
+++ b/tap.c
@@ -48,6 +48,7 @@
 #include "iov.h"
 #include "passt.h"
 #include "conf.h"
+#include "fwd.h"
 #include "arp.h"
 #include "dhcp.h"
 #include "ndp.h"
@@ -162,6 +163,16 @@ void tap_send_single(const struct ctx *c, const void *data, size_t l2len)
 	}
 }
 
+/**
+ * tap_check_src_addr4() - Note an IPv4 address seen in guest traffic
+ * @c:		Execution context
+ * @addr:	IPv4 address seen as source from guest
+ */
+static void tap_check_src_addr4(struct ctx *c, const struct in_addr *addr)
+{
+	fwd_set_observed_ip4(c, addr);
+}
+
 /**
  * tap_ip6_daddr() - Normal IPv6 destination address for inbound packets
  * @c:		Execution context
@@ -772,8 +783,8 @@ resume:
 			continue;
 		}
 
-		if (iph->saddr && c->ip4.addr_seen.s_addr != iph->saddr)
-			c->ip4.addr_seen.s_addr = iph->saddr;
+		if (iph->saddr)
+			tap_check_src_addr4(c, (const struct in_addr *)&iph->saddr);
 
 		if (!iov_drop_header(&data, hlen))
 			continue;