[PATCH v3 00/11] Introduce multiple addresses
This version contains what I perceive as the least controversial parts of my previous RFC series. It basically makes address handling behave like before, but now allowing multiple addresses both at the host side and the guest side. v2: - Added the earlier standalone CIDR commit to the head of the series. - Replaced the guest namespace interface subscriptions with just an address observation feature, so that it works with both PASTA and PASST. - Unified 'no_copy_addrs' and 'copy_addrs' code paths, as suggested by David G. - Multiple other changes, also based on feedback from David. - Removed the host interface subscription patches, -for now. I intend to re-add them once this series is applied. - Outstanding question: When do we add an IPv4 link local address to the guest? Only in local/opaque mode? Only when explicitly requested? Always? v3: - Unified the IPv4 and IPv6 arrays into one array - Changed prefix_len to always eb in IPv6/IpV4 mapped format - Updated migration protocol to v3, handling multiple addresses - Many other smaller changes, based on feedback from the PASST team Jon Maloy (11): conf: Support CIDR notation for -a/--address option ip: Add IN4_MASK() macro for IPv4 netmask calculation ip: Introduce unified multi-address data structures fwd: Check all configured addresses in guest accessibility functions arp: Check all configured addresses in ARP filtering pasta: Extract pasta_ns_conf_ip4/6() to reduce nesting conf: Allow multiple -a/--address options per address family migrate: Rename v1 address functions to v2 for clarity ip: Track observed guest IPv4 addresses in unified address array ip: Track observed guest IPv6 addresses in unified address array conf: Select addresses for DHCP and NDP distribution arp.c | 17 ++-- conf.c | 249 ++++++++++++++++++++++++++++++++---------------------- conf.h | 6 ++ dhcp.c | 23 +++-- dhcp.h | 2 +- dhcpv6.c | 23 +++-- dhcpv6.h | 2 +- fwd.c | 216 ++++++++++++++++++++++++++++++++++++++-------- fwd.h | 6 ++ inany.c | 68 +++++++++++++++ inany.h | 1 + ip.c | 22 +++++ ip.h | 9 ++ migrate.c | 244 ++++++++++++++++++++++++++++++++++++++++++++++++---- ndp.c | 19 +++-- ndp.h | 4 +- passt.1 | 17 ++-- passt.h | 134 ++++++++++++++++++++++++++--- pasta.c | 197 ++++++++++++++++++++++++------------------ tap.c | 117 +++++++++++++++++++++---- 20 files changed, 1080 insertions(+), 296 deletions(-) -- 2.52.0
Extend the -a/--address option to accept addresses in CIDR notation
(e.g., 192.168.1.1/24 or 2001:db8::1/64) as an alternative to using
separate -a and -n options.
We add a new inany_prefix_pton() helper function that:
- Parses address strings with optional /prefix_len suffix
- Validates prefix length based on address family (0-32 for IPv4,
0-128 for IPv6), including handling of IPv4-to-IPv6 mapping case.
- Returns identified address family, if any.
For IPv4, the prefix length is stored in ip4.prefix_len when provided.
Mixing -n and CIDR notation results in an error to catch likely user
mistakes.
Also fix a bug in conf_ip4_prefix() that was incorrectly using the
global 'optarg' instead of its 'arg' parameter.
Signed-off-by: Jon Maloy
Add a convenience macro to compute an IPv4 netmask from a prefix length.
This simplifies netmask calculations throughout the codebase.
Signed-off-by: Jon Maloy
As a preparation for handling multiple addresses, we update
fwd_guest_accessible4() and fwd_guest_accessible6() to check
against all addresses in the unified addrs[] array using the
for_each_addr() macro.
This ensures that when multiple addresses are configured via -a options,
inbound traffic for any of them is correctly detected as having no valid
forwarding path, and subsequently dropped. This occurs when a peer
address collides with an address the guest is using, and we have no
translation for it.
Signed-off-by: Jon Maloy
As preparation for supporting multiple addresses per interface, we
replace the single addr/prefix_len fields with an array. The
array consists of a new struct inany_addr_entry containing an
address and prefix length, both in inany_addr format.
There are only two functional changes:
- The indicated IPv6 prefix length is now properly stored, instead
of being ignored and overridden with the hardcoded value 64, as
as has been the case until now.
- Since even IPv4 addresses now are stored in IPv6 format, we
also store the corresponding prefix length in that format,
i.e. using the range [96,128] instead of [0,32].
Signed-off-by: Jon Maloy
As a preparation for handling multiple addresses, we update ignore_arp()
to check against all addresses in the unified addrs[] array using the
for_each_addr() macro.
Signed-off-by: Jon Maloy
Extract the IPv4 and IPv6 namespace configuration code from
pasta_ns_conf() into separate static functions. This reduces
indentation depth and prepares for adding multi-address support.
No functional change.
Signed-off-by: Jon Maloy
We enable configuration of multiple IPv4 and IPv6 addresses by allowing
repeated use of the -a/--address option.
- We update option parsing to append addresses to the unified addrs[]
array, with limit checks for IP4_MAX_ADDRS and IP6_MAX_ADDRS.
- Each address specified via -a, but with no prefix length indicated,
gets a class-based default prefix length.
- If no -a option is given, addresses/prefix lengths are inherited from
the template interface.
- If a prefix length is to be added, it has to be done in CIDR format,
except for the very first address.
- We configure all indicated addresses in the namespace interface using
the for_each_addr() macro.
Signed-off-by: Jon Maloy
Some migration address structures and functions have a _v1 suffix.
This is confusing, since they are currently handling version 2 of
the migration protocol. We are now going to introduce a new version
3 of the protocol, so we choose to give these functions the correct
suffix _v2 instead. This is in correspondence with current reality,
and will help make a clearer distinction between the old and the new
versions of those functions.
Signed-off-by: Jon Maloy
We remove the addr_seen field in struct ip4_ctx and replace it by
setting a new INANY_ADDR_OBSERVED flag in the corresponding entry in
the unified address array. If the seen address is not present in the
array we add it first.
We also update the migration protocol to v3, so that it can distribute
multiple addresses from the unified array.
Signed-off-by: Jon Maloy
As a part of the multiple address support feature, we add a new
algorithm for DHCP/DHCPv6/NDP address selection and refactor the
conf_print() function accordingly.
The fwd_first_usable_addr() and fwd_select_addr() functions now take
a context pointer and address family parameter to work with the
unified address array using for_each_addr().
Signed-off-by: Jon Maloy
We remove the addr_seen and addr_ll_seen fields in struct ip6_ctx
and replace them by setting INANY_ADDR_OBSERVED and INANY_ADDR_LINKLOCAL
flags in the corresponding entry in the unified address array.
If the seen address is not present in the array we add it first.
This completes the unification of address storage for both IPv4 and
IPv6, enabling future support for multiple guest addresses per family.
Signed-off-by: Jon Maloy
On Fri, Jan 30, 2026 at 04:44:37PM -0500, Jon Maloy wrote:
Extend the -a/--address option to accept addresses in CIDR notation (e.g., 192.168.1.1/24 or 2001:db8::1/64) as an alternative to using separate -a and -n options.
We add a new inany_prefix_pton() helper function that: - Parses address strings with optional /prefix_len suffix - Validates prefix length based on address family (0-32 for IPv4, 0-128 for IPv6), including handling of IPv4-to-IPv6 mapping case. - Returns identified address family, if any.
For IPv4, the prefix length is stored in ip4.prefix_len when provided. Mixing -n and CIDR notation results in an error to catch likely user mistakes.
Also fix a bug in conf_ip4_prefix() that was incorrectly using the global 'optarg' instead of its 'arg' parameter.
Signed-off-by: Jon Maloy
--- v3: Fixes after feedback from Laurent, David and Stefano Notably, updated man page for the -a option
v4: Fixes based on feedback from David G: - Handling prefix length adjustment when IPv4-to-IPv6 mapping - Removed redundant !IN6_IS_ADDR_V4MAPPED(&addr.a6) test - Simplified tests of acceptable address types - Merged documentation and code commits - Some documentation text clarifications
v5: - Moved address/prefix parsing into a refactored inany_prefix_pton() function. - inany_prefix_pton() now only caluclates IPv6 style prefix lengths - Stricter distinction between error causes. - Some refactoring of the 'case a:' branch in conf() - Some small fixes in passt.1 --- conf.c | 58 +++++++++++++++++++++++++++++------------------- inany.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ inany.h | 1 + ip.c | 21 ++++++++++++++++++ ip.h | 2 ++ passt.1 | 17 ++++++++++----- 6 files changed, 139 insertions(+), 28 deletions(-)
diff --git a/conf.c b/conf.c index 2942c8c..98d5d17 100644 --- a/conf.c +++ b/conf.c @@ -682,7 +682,7 @@ static int conf_ip4_prefix(const char *arg) return -1; } else { errno = 0; - len = strtoul(optarg, NULL, 0); + len = strtoul(arg, NULL, 0); if (len > 32 || errno) return -1; } @@ -896,7 +896,7 @@ static void usage(const char *name, FILE *f, int status) " a zero value disables assignment\n" " default: 65520: maximum 802.3 MTU minus 802.3 header\n" " length, rounded to 32 bits (IPv4 words)\n" - " -a, --address ADDR Assign IPv4 or IPv6 address ADDR\n" + " -a, --address ADDR Assign IPv4 or IPv6 address ADDR[/PREFIXLEN]\n" " can be specified zero to two times (for IPv4 and IPv6)\n" " default: use addresses from interface with default route\n" " -n, --netmask MASK Assign IPv4 MASK, dot-decimal or bits\n" @@ -1498,6 +1498,7 @@ void conf(struct ctx *c, int argc, char **argv) const char *optstring = "+dqfel:hs:F:I:p:P:m:a:n:M:g:i:o:D:S:H:461t:u:T:U:"; const char *logname = (c->mode == MODE_PASTA) ? "pasta" : "passt"; char userns[PATH_MAX] = { 0 }, netns[PATH_MAX] = { 0 }; + bool prefix_from_cidr = false, prefix_from_opt = false; bool copy_addrs_opt = false, copy_routes_opt = false; enum fwd_ports_mode fwd_default = FWD_NONE; bool v4_only = false, v6_only = false; @@ -1808,35 +1809,46 @@ void conf(struct ctx *c, int argc, char **argv) c->mtu = mtu; break; } - case 'a': - if (inet_pton(AF_INET6, optarg, &c->ip6.addr) && - !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr) && - !IN6_IS_ADDR_LOOPBACK(&c->ip6.addr) && - !IN6_IS_ADDR_V4MAPPED(&c->ip6.addr) && - !IN6_IS_ADDR_V4COMPAT(&c->ip6.addr) && - !IN6_IS_ADDR_MULTICAST(&c->ip6.addr)) { - if (c->mode == MODE_PASTA) - c->ip6.no_copy_addrs = true; - break; - } - - if (inet_pton(AF_INET, optarg, &c->ip4.addr) && - !IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr) && - !IN4_IS_ADDR_BROADCAST(&c->ip4.addr) && - !IN4_IS_ADDR_LOOPBACK(&c->ip4.addr) && - !IN4_IS_ADDR_MULTICAST(&c->ip4.addr)) { + case 'a': { + union inany_addr addr = { 0 };
You're unconditionally calling inany_prefix_pton(), so you shouldn't need to initialise this.
+ int prefix_len = 0;
Or this.
+ int af; + + af = inany_prefix_pton(optarg, &addr, &prefix_len);
You need to check for errors from inany_prefix_pton().
+ if (inany_is_unspecified(&addr) || + inany_is_multicast(&addr) || + inany_is_loopback(&addr) || + IN6_IS_ADDR_V4COMPAT(&addr.a6)) + die("Invalid address: %s", optarg);
For sanity / extensibility, it probably also makes to fail here if af is not either AF_INET or AF_INET6.
+ + if (prefix_len && !prefix_from_opt) + prefix_from_cidr = true; + else if (prefix_len) + die("Can't mix CIDR with -n"); + + if (af == AF_INET) { + c->ip4.addr = *inany_v4(&addr); + c->ip4.prefix_len = prefix_len ? prefix_len - 96 : + ip4_class_prefix_len(&c->ip4.addr); if (c->mode == MODE_PASTA) c->ip4.no_copy_addrs = true; - break; + } else if (af == AF_INET6) { + c->ip6.addr = addr.a6; + if (c->mode == MODE_PASTA) + c->ip6.no_copy_addrs = true; + } else { + die("Invalid prefix length: %d", prefix_len);
This error message does not seem to match the conditions that trigger it.
} - - die("Invalid address: %s", optarg); break; + } case 'n': + if (prefix_from_cidr) + die("Can't use both -n and CIDR prefix length"); c->ip4.prefix_len = conf_ip4_prefix(optarg); if (c->ip4.prefix_len < 0) die("Invalid netmask: %s", optarg); - + prefix_from_opt = true; break; case 'M': parse_mac(c->our_tap_mac, optarg); diff --git a/inany.c b/inany.c index 7680439..00d44e0 100644 --- a/inany.c +++ b/inany.c @@ -11,6 +11,7 @@ #include
#include #include +#include #include "util.h" #include "ip.h" @@ -57,3 +58,70 @@ int inany_pton(const char *src, union inany_addr *dst)
return 0; } + +/** inany_prefix_pton - Parse an IPv[46] address with prefix length adjustment
"adjustment" doesn't make sense to me here.
+ * @src: IPv[46] address string + * @dst: output buffer, filled with parsed address + * @prefix_len: pointer to prefix length + * + * Return: AF_INET for IPv4, AF_INET6 for IPv6, -1 on error
I'm not particularly convinced that returning the address family here is useful, since the caller can call inany_v4() just as easily as we can. But if we do return the address family, the it probably makes more sense to use AF_UNSPEC for errors, rather than -1, since it's not - strictly speaking - guaranteed that one of the AF_* constants has the value -1.
+ */ +int inany_prefix_pton(char *src, union inany_addr *dst, int *prefix_len)
If we are returning an address family, the return type should be sa_family_t.
+{ + bool mapped = false; + struct in6_addr a6; + struct in_addr a4; + char *slash; + char *end; + int af; + + *prefix_len = 0;
As noted below, 0 is not a suitable value for "missing prefix", because 0 length prefixes are real and important. I think it would be better to make the prefix length non-optional for this functional. The caller can fall back to inany_pton() if this fails. That makes this function more easily reusable for cases where we _require_ a prefix length.
+ + /* Check for presence of /prefix_len suffix */ + slash = strchr(src, '/'); + if (slash) + *slash = '\0'; + + /* Read address */ + if (inet_pton(AF_INET, src, &a4)) { + inany_from_af(dst, AF_INET, &a4); + af = AF_INET; + } else if (inet_pton(AF_INET6, src, &a6)) { + inany_from_af(dst, AF_INET6, &a6); + af = AF_INET6; + if (inany_v4(dst)) + mapped = true; + } else { + memset(dst, 0, sizeof(*dst)); + return -1; + } + + if (!slash) + return mapped ? AF_INET : af;
You can avoid messing around with the mapped temporary by unconditionally deriving the address family from inany_v4().
+ + /* Read prefix_len - /0 is not allowed */
/0 should be allowed. It doesn't make much sense for -a, but it's valid and important if we use this in future for routes (a 0 length prefix indicates a default route).
+ errno = 0; + *prefix_len = strtoul(slash + 1, &end, 10); + if (errno || *end || *prefix_len == 0) + return -1; + + if (mapped) { + /* IPv4-mapped: prefix already in IPv6 format, must be 96-128 */ + if (*prefix_len < 96 || *prefix_len > 128) + return -1; + return AF_INET; + } + + if (af == AF_INET) { + /* Native IPv4: convert to IPv6 format */ + if (*prefix_len > 32) + return -1; + *prefix_len += 96;
If you make this adjustment before the previous stanza you again don't need the 'mapped' local and can use inany_v4().
+ return AF_INET; + } + + /* Native IPv6: keep as-is */ + if (*prefix_len > 128) + return -1;
And if you move this before the if (mapped) stanza, you can remove the duplicated check for > 128.
+ return AF_INET6; +} diff --git a/inany.h b/inany.h index 61b36fb..316ee44 100644 --- a/inany.h +++ b/inany.h @@ -295,5 +295,6 @@ static inline void inany_siphash_feed(struct siphash_state *state,
const char *inany_ntop(const union inany_addr *src, char *dst, socklen_t size); int inany_pton(const char *src, union inany_addr *dst); +int inany_prefix_pton(char *src, union inany_addr *dst, int *prefix_len);
#endif /* INANY_H */ diff --git a/ip.c b/ip.c index 9a7f4c5..40dc24e 100644 --- a/ip.c +++ b/ip.c @@ -13,6 +13,8 @@ */
#include
+#include + #include "util.h" #include "ip.h" @@ -67,3 +69,22 @@ found: *proto = nh; return true; } + +/** + * ip4_class_prefix_len() - Get class based prefix length for IPv4 address + * @addr: IPv4 address + * + * Return: prefix length based on address class, or 32 for other + */ +int ip4_class_prefix_len(const struct in_addr *addr) +{ + in_addr_t a = ntohl(addr->s_addr); + + if (IN_CLASSA(a)) + return 32 - IN_CLASSA_NSHIFT; + if (IN_CLASSB(a)) + return 32 - IN_CLASSB_NSHIFT; + if (IN_CLASSC(a)) + return 32 - IN_CLASSC_NSHIFT; + return 32; +} diff --git a/ip.h b/ip.h index 5830b92..bd28640 100644 --- a/ip.h +++ b/ip.h @@ -135,4 +135,6 @@ static const struct in_addr in4addr_broadcast = { 0xffffffff }; #define IPV6_MIN_MTU 1280 #endif
+int ip4_class_prefix_len(const struct in_addr *addr); + #endif /* IP_H */ diff --git a/passt.1 b/passt.1 index db0d662..53537c4 100644 --- a/passt.1 +++ b/passt.1 @@ -156,10 +156,14 @@ By default, the advertised MTU is 65520 bytes, that is, the maximum 802.3 MTU minus the length of a 802.3 header, rounded to 32 bits (IPv4 words).
.TP -.BR \-a ", " \-\-address " " \fIaddr +.BR \-a ", " \-\-address " " \fIaddr\fR[/\fIprefix_len\fR] Assign IPv4 \fIaddr\fR via DHCP (\fByiaddr\fR), or \fIaddr\fR via DHCPv6 (option 5) and an \fIaddr\fR-based prefix via NDP Router Advertisement (option type 3) for an IPv6 \fIaddr\fR. +An optional /\fIprefix_len\fR (1-32 for IPv4, 1-128 for IPv6) can be +appended in CIDR notation (e.g. 192.0.2.1/24). This is an alternative to +using the \fB-n\fR, \fB--netmask\fR option. Mixing CIDR notation with +\fB-n\fR results in an error. This option can be specified zero (for defaults) to two times (once for IPv4, once for IPv6). By default, assigned IPv4 and IPv6 addresses are taken from the host interfaces @@ -172,10 +176,13 @@ is assigned for IPv4, and no additional address will be assigned for IPv6. .TP .BR \-n ", " \-\-netmask " " \fImask Assign IPv4 netmask \fImask\fR, expressed as dot-decimal or number of bits, via -DHCP (option 1). -By default, the netmask associated to the host address matching the assigned one -is used. If there's no matching address on the host, the netmask is determined -according to the CIDR block of the assigned address (RFC 4632). +DHCP (option 1). Alternatively, the prefix length can be specified using CIDR +notation with the \fB-a\fR, \fB--address\fR option (e.g. \fB-a\fR 192.0.2.1/24). +Mixing \fB-n\fR with CIDR notation results in an error. +If no address is indicated, the netmask associated with the adopted host address, +if any, is used. If an address is indicated, but without a prefix length, the +netmask is determined based on the corresponding network class. In all other +cases, the netmask is determined by using the indicated prefix length.
.TP .BR \-M ", " \-\-mac-addr " " \fIaddr -- 2.52.0
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Fri, Jan 30, 2026 at 04:44:40PM -0500, Jon Maloy wrote:
As a preparation for handling multiple addresses, we update fwd_guest_accessible4() and fwd_guest_accessible6() to check against all addresses in the unified addrs[] array using the for_each_addr() macro.
This ensures that when multiple addresses are configured via -a options, inbound traffic for any of them is correctly detected as having no valid forwarding path, and subsequently dropped. This occurs when a peer address collides with an address the guest is using, and we have no translation for it.
Signed-off-by: Jon Maloy
--- v2: Updated commit log to make it clearer v3: Adapted to changes earlier in the series --- fwd.c | 25 +++++++++++++++++-------- 1 file changed, 17 insertions(+), 8 deletions(-)
diff --git a/fwd.c b/fwd.c index 54248a3..20d581d 100644 --- a/fwd.c +++ b/fwd.c @@ -502,6 +502,8 @@ static bool is_dns_flow(uint8_t proto, const struct flowside *ini) static bool fwd_guest_accessible4(const struct ctx *c, const struct in_addr *addr)
With the changes to this point, there's no longer a point to having separate fwd_guest_accessible4() and fwd_guest_accessible6() functions. fwd_guest_accessible() can check against the combined list in a single pass. This is an example of the simplfications that I think will outweigh the complications introduced by using merged list for v4 and v6 addresses.
{ + const struct inany_addr_entry *e; + if (IN4_IS_ADDR_LOOPBACK(addr)) return false;
@@ -513,12 +515,15 @@ static bool fwd_guest_accessible4(const struct ctx *c, if (IN4_IS_ADDR_UNSPECIFIED(addr)) return false;
- /* For IPv4, addr_seen is initialised to addr, so is always a valid - * address + /* Check against all configured guest addresses */ + for_each_addr(c, e, AF_INET) + if (IN4_ARE_ADDR_EQUAL(addr, inany_v4(&e->addr))) + return false; + + /* Also check addr_seen: it tracks the address the guest is actually + * using, which may differ from configured addresses. */ - if ((first_v4(c) && - IN4_ARE_ADDR_EQUAL(addr, inany_v4(&first_v4(c)->addr))) || - IN4_ARE_ADDR_EQUAL(addr, &c->ip4.addr_seen)) + if (IN4_ARE_ADDR_EQUAL(addr, &c->ip4.addr_seen)) return false;
return true; @@ -535,11 +540,15 @@ static bool fwd_guest_accessible4(const struct ctx *c, static bool fwd_guest_accessible6(const struct ctx *c, const struct in6_addr *addr) { + const struct inany_addr_entry *e; + if (IN6_IS_ADDR_LOOPBACK(addr)) return false;
- if (first_v6(c) && IN6_ARE_ADDR_EQUAL(addr, &first_v6(c)->addr.a6)) - return false; + /* Check against all configured guest addresses */ + for_each_addr(c, e, AF_INET6) + if (IN6_ARE_ADDR_EQUAL(addr, &e->addr.a6)) + return false;
/* For IPv6, addr_seen starts unspecified, because we don't know what LL * address the guest will take until we see it. Only check against it @@ -714,7 +723,7 @@ bool nat_inbound(const struct ctx *c, const union inany_addr *addr, first_v4(c) && inany_equals(addr, &first_v4(c)->addr)) { *translated = inany_from_v4(c->ip4.map_guest_addr); } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_guest_addr) && - first_v6(c) && inany_equals6(addr, &first_v6(c)->addr.a6)) { + first_v6(c) && inany_equals(addr, &first_v6(c)->addr)) { translated->a6 = c->ip6.map_guest_addr; } else if (fwd_guest_accessible(c, addr)) { *translated = *addr; -- 2.52.0
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Fri, Jan 30, 2026 at 04:44:38PM -0500, Jon Maloy wrote:
Add a convenience macro to compute an IPv4 netmask from a prefix length. This simplifies netmask calculations throughout the codebase.
Signed-off-by: Jon Maloy
Reviewed-by: David Gibson
--- conf.c | 2 +- dhcp.c | 2 +- ip.h | 2 ++ 3 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/conf.c b/conf.c index 98d5d17..5188c02 100644 --- a/conf.c +++ b/conf.c @@ -1147,7 +1147,7 @@ static void conf_print(const struct ctx *c) if (!c->no_dhcp) { uint32_t mask;
- mask = htonl(0xffffffff << (32 - c->ip4.prefix_len)); + mask = IN4_MASK(c->ip4.prefix_len);
info("DHCP:"); info(" assign: %s", diff --git a/dhcp.c b/dhcp.c index 6b9c2e3..c552f01 100644 --- a/dhcp.c +++ b/dhcp.c @@ -404,7 +404,7 @@ int dhcp(const struct ctx *c, struct iov_tail *data)
info(" from %s", eth_ntop(m->chaddr, macstr, sizeof(macstr)));
- mask.s_addr = htonl(0xffffffff << (32 - c->ip4.prefix_len)); + mask.s_addr = IN4_MASK(c->ip4.prefix_len); memcpy(opts[1].s, &mask, sizeof(mask)); memcpy(opts[3].s, &c->ip4.guest_gw, sizeof(c->ip4.guest_gw)); memcpy(opts[54].s, &c->ip4.our_tap_addr, sizeof(c->ip4.our_tap_addr)); diff --git a/ip.h b/ip.h index bd28640..c829d84 100644 --- a/ip.h +++ b/ip.h @@ -17,6 +17,8 @@ (ntohl(((struct in_addr *)(a))->s_addr) >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET) #define IN4_IS_ADDR_MULTICAST(a) \ (IN_MULTICAST(ntohl(((struct in_addr *)(a))->s_addr))) +#define IN4_MASK(prefix_len) \ + (htonl(0xffffffff << (128 - (prefix_len)))) #define IN4_ARE_ADDR_EQUAL(a, b) \ (((struct in_addr *)(a))->s_addr == ((struct in_addr *)b)->s_addr) #define IN4ADDR_LOOPBACK_INIT \ -- 2.52.0
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On 2026-02-04 07:50, David Gibson wrote:
On Fri, Jan 30, 2026 at 04:44:37PM -0500, Jon Maloy wrote:
Extend the -a/--address option to accept addresses in CIDR notation (e.g., 192.168.1.1/24 or 2001:db8::1/64) as an alternative to using separate -a and -n options.
We add a new inany_prefix_pton() helper function that: - Parses address strings with optional /prefix_len suffix - Validates prefix length based on address family (0-32 for IPv4, 0-128 for IPv6), including handling of IPv4-to-IPv6 mapping case. - Returns identified address family, if any.
For IPv4, the prefix length is stored in ip4.prefix_len when provided. Mixing -n and CIDR notation results in an error to catch likely user mistakes.
Also fix a bug in conf_ip4_prefix() that was incorrectly using the global 'optarg' instead of its 'arg' parameter.
Signed-off-by: Jon Maloy
--- v3: Fixes after feedback from Laurent, David and Stefano Notably, updated man page for the -a option
v4: Fixes based on feedback from David G: - Handling prefix length adjustment when IPv4-to-IPv6 mapping - Removed redundant !IN6_IS_ADDR_V4MAPPED(&addr.a6) test - Simplified tests of acceptable address types - Merged documentation and code commits - Some documentation text clarifications
v5: - Moved address/prefix parsing into a refactored inany_prefix_pton() function. - inany_prefix_pton() now only caluclates IPv6 style prefix lengths - Stricter distinction between error causes. - Some refactoring of the 'case a:' branch in conf() - Some small fixes in passt.1 --- conf.c | 58 +++++++++++++++++++++++++++++------------------- inany.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ inany.h | 1 + ip.c | 21 ++++++++++++++++++ ip.h | 2 ++ passt.1 | 17 ++++++++++----- 6 files changed, 139 insertions(+), 28 deletions(-)
diff --git a/conf.c b/conf.c index 2942c8c..98d5d17 100644 --- a/conf.c +++ b/conf.c @@ -682,7 +682,7 @@ static int conf_ip4_prefix(const char *arg) return -1; } else { errno = 0; - len = strtoul(optarg, NULL, 0); + len = strtoul(arg, NULL, 0); if (len > 32 || errno) return -1; } @@ -896,7 +896,7 @@ static void usage(const char *name, FILE *f, int status) " a zero value disables assignment\n" " default: 65520: maximum 802.3 MTU minus 802.3 header\n" " length, rounded to 32 bits (IPv4 words)\n" - " -a, --address ADDR Assign IPv4 or IPv6 address ADDR\n" + " -a, --address ADDR Assign IPv4 or IPv6 address ADDR[/PREFIXLEN]\n" " can be specified zero to two times (for IPv4 and IPv6)\n" " default: use addresses from interface with default route\n" " -n, --netmask MASK Assign IPv4 MASK, dot-decimal or bits\n" @@ -1498,6 +1498,7 @@ void conf(struct ctx *c, int argc, char **argv) const char *optstring = "+dqfel:hs:F:I:p:P:m:a:n:M:g:i:o:D:S:H:461t:u:T:U:"; const char *logname = (c->mode == MODE_PASTA) ? "pasta" : "passt"; char userns[PATH_MAX] = { 0 }, netns[PATH_MAX] = { 0 }; + bool prefix_from_cidr = false, prefix_from_opt = false; bool copy_addrs_opt = false, copy_routes_opt = false; enum fwd_ports_mode fwd_default = FWD_NONE; bool v4_only = false, v6_only = false; @@ -1808,35 +1809,46 @@ void conf(struct ctx *c, int argc, char **argv) c->mtu = mtu; break; } - case 'a': - if (inet_pton(AF_INET6, optarg, &c->ip6.addr) && - !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr) && - !IN6_IS_ADDR_LOOPBACK(&c->ip6.addr) && - !IN6_IS_ADDR_V4MAPPED(&c->ip6.addr) && - !IN6_IS_ADDR_V4COMPAT(&c->ip6.addr) && - !IN6_IS_ADDR_MULTICAST(&c->ip6.addr)) { - if (c->mode == MODE_PASTA) - c->ip6.no_copy_addrs = true; - break; - } - - if (inet_pton(AF_INET, optarg, &c->ip4.addr) && - !IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr) && - !IN4_IS_ADDR_BROADCAST(&c->ip4.addr) && - !IN4_IS_ADDR_LOOPBACK(&c->ip4.addr) && - !IN4_IS_ADDR_MULTICAST(&c->ip4.addr)) { + case 'a': { + union inany_addr addr = { 0 };
You're unconditionally calling inany_prefix_pton(), so you shouldn't need to initialise this.
Yes, to guarantee I get an invalid address in case the parsing fails. However, I should do it inside the funtion instead.
+ int prefix_len = 0;
Or this.
Same.
+ int af; + + af = inany_prefix_pton(optarg, &addr, &prefix_len);
You need to check for errors from inany_prefix_pton().
I do that. Further down.
+ if (inany_is_unspecified(&addr) || + inany_is_multicast(&addr) || + inany_is_loopback(&addr) || + IN6_IS_ADDR_V4COMPAT(&addr.a6)) + die("Invalid address: %s", optarg);
For sanity / extensibility, it probably also makes to fail here if af is not either AF_INET or AF_INET6.
This is what I do further down.
+ + if (prefix_len && !prefix_from_opt) + prefix_from_cidr = true; + else if (prefix_len) + die("Can't mix CIDR with -n"); + + if (af == AF_INET) { + c->ip4.addr = *inany_v4(&addr); + c->ip4.prefix_len = prefix_len ? prefix_len - 96 : + ip4_class_prefix_len(&c->ip4.addr); if (c->mode == MODE_PASTA) c->ip4.no_copy_addrs = true; - break; + } else if (af == AF_INET6) { + c->ip6.addr = addr.a6; + if (c->mode == MODE_PASTA) + c->ip6.no_copy_addrs = true; + } else { + die("Invalid prefix length: %d", prefix_len);
This error message does not seem to match the conditions that trigger it.
The only way you can obtain a valid address and a return value different from AF_INET and AF_INET6 is that the prefix_len parsing failed or the value was invalid. All the above is logically correct, although we can discuss if it is the cleanest and best algorithm. The fact that you misunderstand this is a sign it is not.
} - - die("Invalid address: %s", optarg); break; + } case 'n': + if (prefix_from_cidr) + die("Can't use both -n and CIDR prefix length"); c->ip4.prefix_len = conf_ip4_prefix(optarg); if (c->ip4.prefix_len < 0) die("Invalid netmask: %s", optarg); - + prefix_from_opt = true; break; case 'M': parse_mac(c->our_tap_mac, optarg); diff --git a/inany.c b/inany.c index 7680439..00d44e0 100644 --- a/inany.c +++ b/inany.c @@ -11,6 +11,7 @@ #include
#include #include +#include #include "util.h" #include "ip.h" @@ -57,3 +58,70 @@ int inany_pton(const char *src, union inany_addr *dst)
return 0; } + +/** inany_prefix_pton - Parse an IPv[46] address with prefix length adjustment
"adjustment" doesn't make sense to me here.
+ * @src: IPv[46] address string + * @dst: output buffer, filled with parsed address + * @prefix_len: pointer to prefix length + * + * Return: AF_INET for IPv4, AF_INET6 for IPv6, -1 on error
I'm not particularly convinced that returning the address family here is useful, since the caller can call inany_v4() just as easily as we can. But if we do return the address family, the it probably makes more sense to use AF_UNSPEC for errors, rather than -1, since it's not - strictly speaking - guaranteed that one of the AF_* constants has the value -1.
AF_UNSPEC is a valid code, meaning "any" or "don't care". That doesn't sound like a suitable error code, and we do want to know if the parsing failed. I think I'll go for just a 0/-1 return value instead.
+ */ +int inany_prefix_pton(char *src, union inany_addr *dst, int *prefix_len)
If we are returning an address family, the return type should be sa_family_t.
Ok.
+{ + bool mapped = false; + struct in6_addr a6; + struct in_addr a4; + char *slash; + char *end; + int af; + + *prefix_len = 0;
As noted below, 0 is not a suitable value for "missing prefix", because 0 length prefixes are real and important.
0 is a valid value, but the question is if /0 ia a valid suffix in our CIDR format. I see below that you think so.
I think it would be better to make the prefix length non-optional for this functional. The caller can fall back to inany_pton() if this fails. That makes this function more easily reusable for cases where we _require_ a prefix length.
I can try that.
+ + /* Check for presence of /prefix_len suffix */ + slash = strchr(src, '/'); + if (slash) + *slash = '\0'; + + /* Read address */ + if (inet_pton(AF_INET, src, &a4)) { + inany_from_af(dst, AF_INET, &a4); + af = AF_INET; + } else if (inet_pton(AF_INET6, src, &a6)) { + inany_from_af(dst, AF_INET6, &a6); + af = AF_INET6; + if (inany_v4(dst)) + mapped = true; + } else { + memset(dst, 0, sizeof(*dst)); + return -1; + } + + if (!slash) + return mapped ? AF_INET : af;
You can avoid messing around with the mapped temporary by unconditionally deriving the address family from inany_v4().
Ok.
+ + /* Read prefix_len - /0 is not allowed */
/0 should be allowed. It doesn't make much sense for -a, but it's valid and important if we use this in future for routes (a 0 length prefix indicates a default route).
ok. That changes a few things.
+ errno = 0; + *prefix_len = strtoul(slash + 1, &end, 10); + if (errno || *end || *prefix_len == 0) + return -1; + + if (mapped) { + /* IPv4-mapped: prefix already in IPv6 format, must be 96-128 */ + if (*prefix_len < 96 || *prefix_len > 128) + return -1; + return AF_INET; + } + + if (af == AF_INET) { + /* Native IPv4: convert to IPv6 format */ + if (*prefix_len > 32) + return -1; + *prefix_len += 96;
If you make this adjustment before the previous stanza you again don't need the 'mapped' local and can use inany_v4().
Good point.
+ return AF_INET; + } + + /* Native IPv6: keep as-is */ + if (*prefix_len > 128) + return -1;
And if you move this before the if (mapped) stanza, you can remove the duplicated check for > 128.
I'll give it a try. ///jon
+ return AF_INET6; +} diff --git a/inany.h b/inany.h index 61b36fb..316ee44 100644 --- a/inany.h +++ b/inany.h @@ -295,5 +295,6 @@ static inline void inany_siphash_feed(struct siphash_state *state,
const char *inany_ntop(const union inany_addr *src, char *dst, socklen_t size); int inany_pton(const char *src, union inany_addr *dst); +int inany_prefix_pton(char *src, union inany_addr *dst, int *prefix_len);
#endif /* INANY_H */ diff --git a/ip.c b/ip.c index 9a7f4c5..40dc24e 100644 --- a/ip.c +++ b/ip.c @@ -13,6 +13,8 @@ */
#include
+#include + #include "util.h" #include "ip.h" @@ -67,3 +69,22 @@ found: *proto = nh; return true; } + +/** + * ip4_class_prefix_len() - Get class based prefix length for IPv4 address + * @addr: IPv4 address + * + * Return: prefix length based on address class, or 32 for other + */ +int ip4_class_prefix_len(const struct in_addr *addr) +{ + in_addr_t a = ntohl(addr->s_addr); + + if (IN_CLASSA(a)) + return 32 - IN_CLASSA_NSHIFT; + if (IN_CLASSB(a)) + return 32 - IN_CLASSB_NSHIFT; + if (IN_CLASSC(a)) + return 32 - IN_CLASSC_NSHIFT; + return 32; +} diff --git a/ip.h b/ip.h index 5830b92..bd28640 100644 --- a/ip.h +++ b/ip.h @@ -135,4 +135,6 @@ static const struct in_addr in4addr_broadcast = { 0xffffffff }; #define IPV6_MIN_MTU 1280 #endif
+int ip4_class_prefix_len(const struct in_addr *addr); + #endif /* IP_H */ diff --git a/passt.1 b/passt.1 index db0d662..53537c4 100644 --- a/passt.1 +++ b/passt.1 @@ -156,10 +156,14 @@ By default, the advertised MTU is 65520 bytes, that is, the maximum 802.3 MTU minus the length of a 802.3 header, rounded to 32 bits (IPv4 words).
.TP -.BR \-a ", " \-\-address " " \fIaddr +.BR \-a ", " \-\-address " " \fIaddr\fR[/\fIprefix_len\fR] Assign IPv4 \fIaddr\fR via DHCP (\fByiaddr\fR), or \fIaddr\fR via DHCPv6 (option 5) and an \fIaddr\fR-based prefix via NDP Router Advertisement (option type 3) for an IPv6 \fIaddr\fR. +An optional /\fIprefix_len\fR (1-32 for IPv4, 1-128 for IPv6) can be +appended in CIDR notation (e.g. 192.0.2.1/24). This is an alternative to +using the \fB-n\fR, \fB--netmask\fR option. Mixing CIDR notation with +\fB-n\fR results in an error. This option can be specified zero (for defaults) to two times (once for IPv4, once for IPv6). By default, assigned IPv4 and IPv6 addresses are taken from the host interfaces @@ -172,10 +176,13 @@ is assigned for IPv4, and no additional address will be assigned for IPv6. .TP .BR \-n ", " \-\-netmask " " \fImask Assign IPv4 netmask \fImask\fR, expressed as dot-decimal or number of bits, via -DHCP (option 1). -By default, the netmask associated to the host address matching the assigned one -is used. If there's no matching address on the host, the netmask is determined -according to the CIDR block of the assigned address (RFC 4632). +DHCP (option 1). Alternatively, the prefix length can be specified using CIDR +notation with the \fB-a\fR, \fB--address\fR option (e.g. \fB-a\fR 192.0.2.1/24). +Mixing \fB-n\fR with CIDR notation results in an error. +If no address is indicated, the netmask associated with the adopted host address, +if any, is used. If an address is indicated, but without a prefix length, the +netmask is determined based on the corresponding network class. In all other +cases, the netmask is determined by using the indicated prefix length.
.TP .BR \-M ", " \-\-mac-addr " " \fIaddr -- 2.52.0
On 2026-02-04 08:16, David Gibson wrote:
On Fri, Jan 30, 2026 at 04:44:40PM -0500, Jon Maloy wrote:
As a preparation for handling multiple addresses, we update fwd_guest_accessible4() and fwd_guest_accessible6() to check against all addresses in the unified addrs[] array using the for_each_addr() macro.
This ensures that when multiple addresses are configured via -a options, inbound traffic for any of them is correctly detected as having no valid forwarding path, and subsequently dropped. This occurs when a peer address collides with an address the guest is using, and we have no translation for it.
Signed-off-by: Jon Maloy
--- v2: Updated commit log to make it clearer v3: Adapted to changes earlier in the series --- fwd.c | 25 +++++++++++++++++-------- 1 file changed, 17 insertions(+), 8 deletions(-)
diff --git a/fwd.c b/fwd.c index 54248a3..20d581d 100644 --- a/fwd.c +++ b/fwd.c @@ -502,6 +502,8 @@ static bool is_dns_flow(uint8_t proto, const struct flowside *ini) static bool fwd_guest_accessible4(const struct ctx *c, const struct in_addr *addr)
With the changes to this point, there's no longer a point to having separate fwd_guest_accessible4() and fwd_guest_accessible6() functions. fwd_guest_accessible() can check against the combined list in a single pass.
Absolutely. I have already identified several functions which can be unified in a similar way, and have already implemented some. But I still want to wait with such changes until this series has been applied. ///jon
This is an example of the simplfications that I think will outweigh the complications introduced by using merged list for v4 and v6 addresses.
{ + const struct inany_addr_entry *e; + if (IN4_IS_ADDR_LOOPBACK(addr)) return false;
@@ -513,12 +515,15 @@ static bool fwd_guest_accessible4(const struct ctx *c, if (IN4_IS_ADDR_UNSPECIFIED(addr)) return false;
- /* For IPv4, addr_seen is initialised to addr, so is always a valid - * address + /* Check against all configured guest addresses */ + for_each_addr(c, e, AF_INET) + if (IN4_ARE_ADDR_EQUAL(addr, inany_v4(&e->addr))) + return false; + + /* Also check addr_seen: it tracks the address the guest is actually + * using, which may differ from configured addresses. */ - if ((first_v4(c) && - IN4_ARE_ADDR_EQUAL(addr, inany_v4(&first_v4(c)->addr))) || - IN4_ARE_ADDR_EQUAL(addr, &c->ip4.addr_seen)) + if (IN4_ARE_ADDR_EQUAL(addr, &c->ip4.addr_seen)) return false;
return true; @@ -535,11 +540,15 @@ static bool fwd_guest_accessible4(const struct ctx *c, static bool fwd_guest_accessible6(const struct ctx *c, const struct in6_addr *addr) { + const struct inany_addr_entry *e; + if (IN6_IS_ADDR_LOOPBACK(addr)) return false;
- if (first_v6(c) && IN6_ARE_ADDR_EQUAL(addr, &first_v6(c)->addr.a6)) - return false; + /* Check against all configured guest addresses */ + for_each_addr(c, e, AF_INET6) + if (IN6_ARE_ADDR_EQUAL(addr, &e->addr.a6)) + return false;
/* For IPv6, addr_seen starts unspecified, because we don't know what LL * address the guest will take until we see it. Only check against it @@ -714,7 +723,7 @@ bool nat_inbound(const struct ctx *c, const union inany_addr *addr, first_v4(c) && inany_equals(addr, &first_v4(c)->addr)) { *translated = inany_from_v4(c->ip4.map_guest_addr); } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_guest_addr) && - first_v6(c) && inany_equals6(addr, &first_v6(c)->addr.a6)) { + first_v6(c) && inany_equals(addr, &first_v6(c)->addr)) { translated->a6 = c->ip6.map_guest_addr; } else if (fwd_guest_accessible(c, addr)) { *translated = *addr; -- 2.52.0
On Wed, Feb 04, 2026 at 07:56:35PM -0500, Jon Maloy wrote:
On 2026-02-04 07:50, David Gibson wrote:
On Fri, Jan 30, 2026 at 04:44:37PM -0500, Jon Maloy wrote:
Extend the -a/--address option to accept addresses in CIDR notation (e.g., 192.168.1.1/24 or 2001:db8::1/64) as an alternative to using separate -a and -n options.
We add a new inany_prefix_pton() helper function that: - Parses address strings with optional /prefix_len suffix - Validates prefix length based on address family (0-32 for IPv4, 0-128 for IPv6), including handling of IPv4-to-IPv6 mapping case. - Returns identified address family, if any.
For IPv4, the prefix length is stored in ip4.prefix_len when provided. Mixing -n and CIDR notation results in an error to catch likely user mistakes.
Also fix a bug in conf_ip4_prefix() that was incorrectly using the global 'optarg' instead of its 'arg' parameter.
Signed-off-by: Jon Maloy
--- v3: Fixes after feedback from Laurent, David and Stefano Notably, updated man page for the -a option
v4: Fixes based on feedback from David G: - Handling prefix length adjustment when IPv4-to-IPv6 mapping - Removed redundant !IN6_IS_ADDR_V4MAPPED(&addr.a6) test - Simplified tests of acceptable address types - Merged documentation and code commits - Some documentation text clarifications
v5: - Moved address/prefix parsing into a refactored inany_prefix_pton() function. - inany_prefix_pton() now only caluclates IPv6 style prefix lengths - Stricter distinction between error causes. - Some refactoring of the 'case a:' branch in conf() - Some small fixes in passt.1 --- conf.c | 58 +++++++++++++++++++++++++++++------------------- inany.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ inany.h | 1 + ip.c | 21 ++++++++++++++++++ ip.h | 2 ++ passt.1 | 17 ++++++++++----- 6 files changed, 139 insertions(+), 28 deletions(-)
diff --git a/conf.c b/conf.c index 2942c8c..98d5d17 100644 --- a/conf.c +++ b/conf.c @@ -682,7 +682,7 @@ static int conf_ip4_prefix(const char *arg) return -1; } else { errno = 0; - len = strtoul(optarg, NULL, 0); + len = strtoul(arg, NULL, 0); if (len > 32 || errno) return -1; } @@ -896,7 +896,7 @@ static void usage(const char *name, FILE *f, int status) " a zero value disables assignment\n" " default: 65520: maximum 802.3 MTU minus 802.3 header\n" " length, rounded to 32 bits (IPv4 words)\n" - " -a, --address ADDR Assign IPv4 or IPv6 address ADDR\n" + " -a, --address ADDR Assign IPv4 or IPv6 address ADDR[/PREFIXLEN]\n" " can be specified zero to two times (for IPv4 and IPv6)\n" " default: use addresses from interface with default route\n" " -n, --netmask MASK Assign IPv4 MASK, dot-decimal or bits\n" @@ -1498,6 +1498,7 @@ void conf(struct ctx *c, int argc, char **argv) const char *optstring = "+dqfel:hs:F:I:p:P:m:a:n:M:g:i:o:D:S:H:461t:u:T:U:"; const char *logname = (c->mode == MODE_PASTA) ? "pasta" : "passt"; char userns[PATH_MAX] = { 0 }, netns[PATH_MAX] = { 0 }; + bool prefix_from_cidr = false, prefix_from_opt = false; bool copy_addrs_opt = false, copy_routes_opt = false; enum fwd_ports_mode fwd_default = FWD_NONE; bool v4_only = false, v6_only = false; @@ -1808,35 +1809,46 @@ void conf(struct ctx *c, int argc, char **argv) c->mtu = mtu; break; } - case 'a': - if (inet_pton(AF_INET6, optarg, &c->ip6.addr) && - !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr) && - !IN6_IS_ADDR_LOOPBACK(&c->ip6.addr) && - !IN6_IS_ADDR_V4MAPPED(&c->ip6.addr) && - !IN6_IS_ADDR_V4COMPAT(&c->ip6.addr) && - !IN6_IS_ADDR_MULTICAST(&c->ip6.addr)) { - if (c->mode == MODE_PASTA) - c->ip6.no_copy_addrs = true; - break; - } - - if (inet_pton(AF_INET, optarg, &c->ip4.addr) && - !IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr) && - !IN4_IS_ADDR_BROADCAST(&c->ip4.addr) && - !IN4_IS_ADDR_LOOPBACK(&c->ip4.addr) && - !IN4_IS_ADDR_MULTICAST(&c->ip4.addr)) { + case 'a': { + union inany_addr addr = { 0 };
You're unconditionally calling inany_prefix_pton(), so you shouldn't need to initialise this.
Yes, to guarantee I get an invalid address in case the parsing fails. However, I should do it inside the funtion instead.
Right.
+ int prefix_len = 0;
Or this.
Same.
+ int af; + + af = inany_prefix_pton(optarg, &addr, &prefix_len);
You need to check for errors from inany_prefix_pton().
I do that. Further down.
Ok, but it's hard to follow when separated from the call like that.
+ if (inany_is_unspecified(&addr) || + inany_is_multicast(&addr) || + inany_is_loopback(&addr) || + IN6_IS_ADDR_V4COMPAT(&addr.a6)) + die("Invalid address: %s", optarg);
For sanity / extensibility, it probably also makes to fail here if af is not either AF_INET or AF_INET6.
This is what I do further down.
Ah, true.
+ + if (prefix_len && !prefix_from_opt) + prefix_from_cidr = true; + else if (prefix_len) + die("Can't mix CIDR with -n"); + + if (af == AF_INET) { + c->ip4.addr = *inany_v4(&addr); + c->ip4.prefix_len = prefix_len ? prefix_len - 96 : + ip4_class_prefix_len(&c->ip4.addr); if (c->mode == MODE_PASTA) c->ip4.no_copy_addrs = true; - break; + } else if (af == AF_INET6) { + c->ip6.addr = addr.a6; + if (c->mode == MODE_PASTA) + c->ip6.no_copy_addrs = true; + } else { + die("Invalid prefix length: %d", prefix_len);
This error message does not seem to match the conditions that trigger it.
The only way you can obtain a valid address and a return value different from AF_INET and AF_INET6 is that the prefix_len parsing failed or the value was invalid.
Right, but it's a minor layering violation to assume that in the caller.
All the above is logically correct, although we can discuss if it is the cleanest and best algorithm. The fact that you misunderstand this is a sign it is not.
} - - die("Invalid address: %s", optarg); break; + } case 'n': + if (prefix_from_cidr) + die("Can't use both -n and CIDR prefix length"); c->ip4.prefix_len = conf_ip4_prefix(optarg); if (c->ip4.prefix_len < 0) die("Invalid netmask: %s", optarg); - + prefix_from_opt = true; break; case 'M': parse_mac(c->our_tap_mac, optarg); diff --git a/inany.c b/inany.c index 7680439..00d44e0 100644 --- a/inany.c +++ b/inany.c @@ -11,6 +11,7 @@ #include
#include #include +#include #include "util.h" #include "ip.h" @@ -57,3 +58,70 @@ int inany_pton(const char *src, union inany_addr *dst) return 0; } + +/** inany_prefix_pton - Parse an IPv[46] address with prefix length adjustment "adjustment" doesn't make sense to me here.
+ * @src: IPv[46] address string + * @dst: output buffer, filled with parsed address + * @prefix_len: pointer to prefix length + * + * Return: AF_INET for IPv4, AF_INET6 for IPv6, -1 on error
I'm not particularly convinced that returning the address family here is useful, since the caller can call inany_v4() just as easily as we can. But if we do return the address family, the it probably makes more sense to use AF_UNSPEC for errors, rather than -1, since it's not - strictly speaking - guaranteed that one of the AF_* constants has the value -1.
AF_UNSPEC is a valid code, meaning "any" or "don't care". That doesn't sound like a suitable error code, and we do want to know if the parsing failed. I think I'll go for just a 0/-1 return value instead.
Ok.
+ */ +int inany_prefix_pton(char *src, union inany_addr *dst, int *prefix_len)
If we are returning an address family, the return type should be sa_family_t. Ok.
+{ + bool mapped = false; + struct in6_addr a6; + struct in_addr a4; + char *slash; + char *end; + int af; + + *prefix_len = 0;
As noted below, 0 is not a suitable value for "missing prefix", because 0 length prefixes are real and important.
0 is a valid value, but the question is if /0 ia a valid suffix in our CIDR format. I see below that you think so.
For -a, maybe not. But we want to be able to re-use this for things where it will be.
I think it would be better to make the prefix length non-optional for this functional. The caller can fall back to inany_pton() if this fails. That makes this function more easily reusable for cases where we _require_ a prefix length.
I can try that.
+ + /* Check for presence of /prefix_len suffix */ + slash = strchr(src, '/'); + if (slash) + *slash = '\0'; + + /* Read address */ + if (inet_pton(AF_INET, src, &a4)) { + inany_from_af(dst, AF_INET, &a4); + af = AF_INET; + } else if (inet_pton(AF_INET6, src, &a6)) { + inany_from_af(dst, AF_INET6, &a6); + af = AF_INET6; + if (inany_v4(dst)) + mapped = true; + } else { + memset(dst, 0, sizeof(*dst)); + return -1; + } + + if (!slash) + return mapped ? AF_INET : af;
You can avoid messing around with the mapped temporary by unconditionally deriving the address family from inany_v4().
Ok.
+ + /* Read prefix_len - /0 is not allowed */
/0 should be allowed. It doesn't make much sense for -a, but it's valid and important if we use this in future for routes (a 0 length prefix indicates a default route).
ok. That changes a few things.
+ errno = 0; + *prefix_len = strtoul(slash + 1, &end, 10); + if (errno || *end || *prefix_len == 0) + return -1; + + if (mapped) { + /* IPv4-mapped: prefix already in IPv6 format, must be 96-128 */ + if (*prefix_len < 96 || *prefix_len > 128) + return -1; + return AF_INET; + } + + if (af == AF_INET) { + /* Native IPv4: convert to IPv6 format */ + if (*prefix_len > 32) + return -1; + *prefix_len += 96;
If you make this adjustment before the previous stanza you again don't need the 'mapped' local and can use inany_v4().
Good point.
+ return AF_INET; + } + + /* Native IPv6: keep as-is */ + if (*prefix_len > 128) + return -1;
And if you move this before the if (mapped) stanza, you can remove the duplicated check for > 128.
I'll give it a try.
///jon
+ return AF_INET6; +} diff --git a/inany.h b/inany.h index 61b36fb..316ee44 100644 --- a/inany.h +++ b/inany.h @@ -295,5 +295,6 @@ static inline void inany_siphash_feed(struct siphash_state *state, const char *inany_ntop(const union inany_addr *src, char *dst, socklen_t size); int inany_pton(const char *src, union inany_addr *dst); +int inany_prefix_pton(char *src, union inany_addr *dst, int *prefix_len); #endif /* INANY_H */ diff --git a/ip.c b/ip.c index 9a7f4c5..40dc24e 100644 --- a/ip.c +++ b/ip.c @@ -13,6 +13,8 @@ */ #include
+#include + #include "util.h" #include "ip.h" @@ -67,3 +69,22 @@ found: *proto = nh; return true; } + +/** + * ip4_class_prefix_len() - Get class based prefix length for IPv4 address + * @addr: IPv4 address + * + * Return: prefix length based on address class, or 32 for other + */ +int ip4_class_prefix_len(const struct in_addr *addr) +{ + in_addr_t a = ntohl(addr->s_addr); + + if (IN_CLASSA(a)) + return 32 - IN_CLASSA_NSHIFT; + if (IN_CLASSB(a)) + return 32 - IN_CLASSB_NSHIFT; + if (IN_CLASSC(a)) + return 32 - IN_CLASSC_NSHIFT; + return 32; +} diff --git a/ip.h b/ip.h index 5830b92..bd28640 100644 --- a/ip.h +++ b/ip.h @@ -135,4 +135,6 @@ static const struct in_addr in4addr_broadcast = { 0xffffffff }; #define IPV6_MIN_MTU 1280 #endif +int ip4_class_prefix_len(const struct in_addr *addr); + #endif /* IP_H */ diff --git a/passt.1 b/passt.1 index db0d662..53537c4 100644 --- a/passt.1 +++ b/passt.1 @@ -156,10 +156,14 @@ By default, the advertised MTU is 65520 bytes, that is, the maximum 802.3 MTU minus the length of a 802.3 header, rounded to 32 bits (IPv4 words). .TP -.BR \-a ", " \-\-address " " \fIaddr +.BR \-a ", " \-\-address " " \fIaddr\fR[/\fIprefix_len\fR] Assign IPv4 \fIaddr\fR via DHCP (\fByiaddr\fR), or \fIaddr\fR via DHCPv6 (option 5) and an \fIaddr\fR-based prefix via NDP Router Advertisement (option type 3) for an IPv6 \fIaddr\fR. +An optional /\fIprefix_len\fR (1-32 for IPv4, 1-128 for IPv6) can be +appended in CIDR notation (e.g. 192.0.2.1/24). This is an alternative to +using the \fB-n\fR, \fB--netmask\fR option. Mixing CIDR notation with +\fB-n\fR results in an error. This option can be specified zero (for defaults) to two times (once for IPv4, once for IPv6). By default, assigned IPv4 and IPv6 addresses are taken from the host interfaces @@ -172,10 +176,13 @@ is assigned for IPv4, and no additional address will be assigned for IPv6. .TP .BR \-n ", " \-\-netmask " " \fImask Assign IPv4 netmask \fImask\fR, expressed as dot-decimal or number of bits, via -DHCP (option 1). -By default, the netmask associated to the host address matching the assigned one -is used. If there's no matching address on the host, the netmask is determined -according to the CIDR block of the assigned address (RFC 4632). +DHCP (option 1). Alternatively, the prefix length can be specified using CIDR +notation with the \fB-a\fR, \fB--address\fR option (e.g. \fB-a\fR 192.0.2.1/24). +Mixing \fB-n\fR with CIDR notation results in an error. +If no address is indicated, the netmask associated with the adopted host address, +if any, is used. If an address is indicated, but without a prefix length, the +netmask is determined based on the corresponding network class. In all other +cases, the netmask is determined by using the indicated prefix length. .TP .BR \-M ", " \-\-mac-addr " " \fIaddr -- 2.52.0
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Wed, Feb 04, 2026 at 08:01:45PM -0500, Jon Maloy wrote:
On 2026-02-04 08:16, David Gibson wrote:
On Fri, Jan 30, 2026 at 04:44:40PM -0500, Jon Maloy wrote:
As a preparation for handling multiple addresses, we update fwd_guest_accessible4() and fwd_guest_accessible6() to check against all addresses in the unified addrs[] array using the for_each_addr() macro.
This ensures that when multiple addresses are configured via -a options, inbound traffic for any of them is correctly detected as having no valid forwarding path, and subsequently dropped. This occurs when a peer address collides with an address the guest is using, and we have no translation for it.
Signed-off-by: Jon Maloy
--- v2: Updated commit log to make it clearer v3: Adapted to changes earlier in the series --- fwd.c | 25 +++++++++++++++++-------- 1 file changed, 17 insertions(+), 8 deletions(-)
diff --git a/fwd.c b/fwd.c index 54248a3..20d581d 100644 --- a/fwd.c +++ b/fwd.c @@ -502,6 +502,8 @@ static bool is_dns_flow(uint8_t proto, const struct flowside *ini) static bool fwd_guest_accessible4(const struct ctx *c, const struct in_addr *addr)
With the changes to this point, there's no longer a point to having separate fwd_guest_accessible4() and fwd_guest_accessible6() functions. fwd_guest_accessible() can check against the combined list in a single pass.
Absolutely. I have already identified several functions which can be unified in a similar way, and have already implemented some. But I still want to wait with such changes until this series has been applied.
That doesn't make a lot of sense to me. I mean, I understand doing things one step at a time. But in this case keeping the old structure means a bunch of extra churn, and the intermediate steps are more confusing to read and review than they need to be.
This is an example of the simplfications that I think will outweigh the complications introduced by using merged list for v4 and v6 addresses.
{ + const struct inany_addr_entry *e; + if (IN4_IS_ADDR_LOOPBACK(addr)) return false; @@ -513,12 +515,15 @@ static bool fwd_guest_accessible4(const struct ctx *c, if (IN4_IS_ADDR_UNSPECIFIED(addr)) return false; - /* For IPv4, addr_seen is initialised to addr, so is always a valid - * address + /* Check against all configured guest addresses */ + for_each_addr(c, e, AF_INET) + if (IN4_ARE_ADDR_EQUAL(addr, inany_v4(&e->addr))) + return false; + + /* Also check addr_seen: it tracks the address the guest is actually + * using, which may differ from configured addresses. */ - if ((first_v4(c) && - IN4_ARE_ADDR_EQUAL(addr, inany_v4(&first_v4(c)->addr))) || - IN4_ARE_ADDR_EQUAL(addr, &c->ip4.addr_seen)) + if (IN4_ARE_ADDR_EQUAL(addr, &c->ip4.addr_seen)) return false; return true; @@ -535,11 +540,15 @@ static bool fwd_guest_accessible4(const struct ctx *c, static bool fwd_guest_accessible6(const struct ctx *c, const struct in6_addr *addr) { + const struct inany_addr_entry *e; + if (IN6_IS_ADDR_LOOPBACK(addr)) return false; - if (first_v6(c) && IN6_ARE_ADDR_EQUAL(addr, &first_v6(c)->addr.a6)) - return false; + /* Check against all configured guest addresses */ + for_each_addr(c, e, AF_INET6) + if (IN6_ARE_ADDR_EQUAL(addr, &e->addr.a6)) + return false; /* For IPv6, addr_seen starts unspecified, because we don't know what LL * address the guest will take until we see it. Only check against it @@ -714,7 +723,7 @@ bool nat_inbound(const struct ctx *c, const union inany_addr *addr, first_v4(c) && inany_equals(addr, &first_v4(c)->addr)) { *translated = inany_from_v4(c->ip4.map_guest_addr); } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_guest_addr) && - first_v6(c) && inany_equals6(addr, &first_v6(c)->addr.a6)) { + first_v6(c) && inany_equals(addr, &first_v6(c)->addr)) { translated->a6 = c->ip6.map_guest_addr; } else if (fwd_guest_accessible(c, addr)) { *translated = *addr; -- 2.52.0
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Fri, Jan 30, 2026 at 04:44:39PM -0500, Jon Maloy wrote:
As preparation for supporting multiple addresses per interface, we replace the single addr/prefix_len fields with an array. The array consists of a new struct inany_addr_entry containing an address and prefix length, both in inany_addr format.
There are only two functional changes: - The indicated IPv6 prefix length is now properly stored, instead of being ignored and overridden with the hardcoded value 64, as as has been the case until now. - Since even IPv4 addresses now are stored in IPv6 format, we also store the corresponding prefix length in that format, i.e. using the range [96,128] instead of [0,32].
Signed-off-by: Jon Maloy
--- v2: -Using inany_addr instead of protocol specific addresses as entry address field.
v3: -Merging into one array, directly in struct ctx -Changed prefix_len and flags fields in struct inany_addr_entry to uint8_t, since that makes the struct directly migratable --- arp.c | 5 +- conf.c | 145 ++++++++++++++++++++++++++++++++----------------------- conf.h | 6 +++ dhcp.c | 8 +-- dhcpv6.c | 6 +-- fwd.c | 17 ++++--- ip.c | 11 +++-- ip.h | 5 ++ ndp.c | 6 +-- passt.h | 123 +++++++++++++++++++++++++++++++++++++++++++--- pasta.c | 15 ++++-- tap.c | 5 +- 12 files changed, 251 insertions(+), 101 deletions(-)
diff --git a/arp.c b/arp.c index bb042e9..f16beac 100644 --- a/arp.c +++ b/arp.c @@ -54,7 +54,8 @@ static bool ignore_arp(const struct ctx *c, return true;
/* Don't resolve the guest's assigned address, either. */ - if (!memcmp(am->tip, &c->ip4.addr, sizeof(am->tip))) + if (first_v4(c) &&
first_v4() isn't a great name as a global function. But then I guess later patches in the series will remove most of the uses, so it might be ok in the interim.
+ !memcmp(am->tip, inany_v4(&first_v4(c)->addr), sizeof(am->tip))) return true;
return false; @@ -145,7 +146,7 @@ void arp_send_init_req(const struct ctx *c) memcpy(req.am.sha, c->our_tap_mac, sizeof(req.am.sha)); memcpy(req.am.sip, &c->ip4.our_tap_addr, sizeof(req.am.sip)); memcpy(req.am.tha, MAC_BROADCAST, sizeof(req.am.tha)); - memcpy(req.am.tip, &c->ip4.addr, sizeof(req.am.tip)); + memcpy(req.am.tip, inany_v4(&first_v4(c)->addr), sizeof(req.am.tip));
debug("Sending initial ARP request for guest MAC address"); tap_send_single(c, &req, sizeof(req)); diff --git a/conf.c b/conf.c index 5188c02..bb6bcf8 100644 --- a/conf.c +++ b/conf.c @@ -56,7 +56,7 @@ #define IP4_LL_GUEST_GW (struct in_addr){ htonl_constant(0xa9fe0202) } /* 169.254.2.2, libslirp default: 10.0.2.2 */
-#define IP4_LL_PREFIX_LEN 16 +#define IP4_LL_PREFIX_LEN 112 /* 96 + 16, in IPv6 format */
I don't think the encoding should be changed without also changing the #define name.
#define IP6_LL_GUEST_GW (struct in6_addr) \ {{{ 0xfe, 0x80, 0, 0, 0, 0, 0, 0, \ @@ -693,11 +693,11 @@ static int conf_ip4_prefix(const char *arg) /** * conf_ip4() - Verify or detect IPv4 support, get relevant addresses * @ifi: Host interface to attempt (0 to determine one) - * @ip4: IPv4 context (will be written) + * @c: Execution context (will be written)
Putting @c as the first parameter is a pretty strong convention.
* * Return: interface index for IPv4, or 0 on failure. */ -static unsigned int conf_ip4(unsigned int ifi, struct ip4_ctx *ip4) +static unsigned int conf_ip4(unsigned int ifi, struct ctx *c) { if (!ifi) ifi = nl_get_ext_if(nl_sock, AF_INET); @@ -707,9 +707,9 @@ static unsigned int conf_ip4(unsigned int ifi, struct ip4_ctx *ip4) return 0; }
- if (IN4_IS_ADDR_UNSPECIFIED(&ip4->guest_gw)) { + if (IN4_IS_ADDR_UNSPECIFIED(&c->ip4.guest_gw)) { int rc = nl_route_get_def(nl_sock, ifi, AF_INET, - &ip4->guest_gw); + &c->ip4.guest_gw); if (rc < 0) { debug("Couldn't discover IPv4 gateway address: %s", strerror_(-rc)); @@ -717,60 +717,57 @@ static unsigned int conf_ip4(unsigned int ifi, struct ip4_ctx *ip4) } }
- if (IN4_IS_ADDR_UNSPECIFIED(&ip4->addr)) { + if (!count_v4(c)) {
Do you need count_v4() at this stage of the series? Testing first_v4() would work just as well here, and so far this is the only caller of count_v4().
+ struct in_addr addr = {0,}; + int prefix_len = 0; int rc = nl_addr_get(nl_sock, ifi, AF_INET, - &ip4->addr, &ip4->prefix_len, NULL); + &addr, &prefix_len, NULL); if (rc < 0) { debug("Couldn't discover IPv4 address: %s", strerror_(-rc)); return 0; } - } + if (IN4_IS_ADDR_UNSPECIFIED(&addr)) + return 0;
- if (!ip4->prefix_len) { - in_addr_t addr = ntohl(ip4->addr.s_addr); - if (IN_CLASSA(addr)) - ip4->prefix_len = (32 - IN_CLASSA_NSHIFT); - else if (IN_CLASSB(addr)) - ip4->prefix_len = (32 - IN_CLASSB_NSHIFT); - else if (IN_CLASSC(addr)) - ip4->prefix_len = (32 - IN_CLASSC_NSHIFT); - else - ip4->prefix_len = 32; + c->addrs[c->addr_count].addr = inany_from_v4(addr); + c->addrs[c->addr_count].prefix_len = prefix_len + 96; + c->addrs[c->addr_count].flags = INANY_ADDR_HOST; + c->addr_count++;
I suspect an add_prefix() type function will be worth it, by the time we're done.
+ c->ip4.addr_seen = addr; }
- ip4->addr_seen = ip4->addr; - - ip4->our_tap_addr = ip4->guest_gw; - - if (IN4_IS_ADDR_UNSPECIFIED(&ip4->addr)) - return 0; + c->ip4.our_tap_addr = c->ip4.guest_gw;
return ifi; }
/** * conf_ip4_local() - Configure IPv4 addresses and attributes for local mode - * @ip4: IPv4 context (will be written) + * @c: Execution context (will be written) */ -static void conf_ip4_local(struct ip4_ctx *ip4) +static void conf_ip4_local(struct ctx *c) { - ip4->addr_seen = ip4->addr = IP4_LL_GUEST_ADDR; - ip4->our_tap_addr = ip4->guest_gw = IP4_LL_GUEST_GW; - ip4->prefix_len = IP4_LL_PREFIX_LEN; + c->addrs[c->addr_count].addr = inany_from_v4(IP4_LL_GUEST_ADDR); + c->ip4.addr_seen = *inany_v4(&c->addrs[c->addr_count].addr); + c->ip4.our_tap_addr = c->ip4.guest_gw = IP4_LL_GUEST_GW; + c->addrs[c->addr_count].prefix_len = IP4_LL_PREFIX_LEN;
I'd add the 96 here, rather than putting it into the constant.
+ c->addr_count++;
- ip4->no_copy_addrs = ip4->no_copy_routes = true; + c->ip4.no_copy_addrs = c->ip4.no_copy_routes = true; }
/** * conf_ip6() - Verify or detect IPv6 support, get relevant addresses * @ifi: Host interface to attempt (0 to determine one) - * @ip6: IPv6 context (will be written) + * @c: Execution context (will be written) * * Return: interface index for IPv6, or 0 on failure. */ -static unsigned int conf_ip6(unsigned int ifi, struct ip6_ctx *ip6) +static unsigned int conf_ip6(unsigned int ifi, struct ctx *c) { + struct inany_addr_entry *e; + union inany_addr addr = {0,}; int prefix_len = 0; int rc;
@@ -782,8 +779,8 @@ static unsigned int conf_ip6(unsigned int ifi, struct ip6_ctx *ip6) return 0; }
- if (IN6_IS_ADDR_UNSPECIFIED(&ip6->guest_gw)) { - rc = nl_route_get_def(nl_sock, ifi, AF_INET6, &ip6->guest_gw); + if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.guest_gw)) { + rc = nl_route_get_def(nl_sock, ifi, AF_INET6, &c->ip6.guest_gw); if (rc < 0) { debug("Couldn't discover IPv6 gateway address: %s", strerror_(-rc)); @@ -791,21 +788,28 @@ static unsigned int conf_ip6(unsigned int ifi, struct ip6_ctx *ip6) } }
- rc = nl_addr_get(nl_sock, ifi, AF_INET6, - IN6_IS_ADDR_UNSPECIFIED(&ip6->addr) ? &ip6->addr : NULL, - &prefix_len, &ip6->our_tap_ll); + rc = nl_addr_get(nl_sock, ifi, AF_INET6, &addr.a6, + &prefix_len, &c->ip6.our_tap_ll); if (rc < 0) { debug("Couldn't discover IPv6 address: %s", strerror_(-rc)); return 0; }
- ip6->addr_seen = ip6->addr; + if (!count_v6(c)) {
Same comment as for count_v4().
+ c->addrs[c->addr_count].addr = addr; + c->addrs[c->addr_count].prefix_len = prefix_len ? prefix_len : 64; + c->addrs[c->addr_count].flags = INANY_ADDR_HOST; + c->addr_count++; + } + + e = first_v6(c); + c->ip6.addr_seen = e->addr.a6;
- if (IN6_IS_ADDR_LINKLOCAL(&ip6->guest_gw)) - ip6->our_tap_ll = ip6->guest_gw; + if (IN6_IS_ADDR_LINKLOCAL(&c->ip6.guest_gw)) + c->ip6.our_tap_ll = c->ip6.guest_gw;
- if (IN6_IS_ADDR_UNSPECIFIED(&ip6->addr) || - IN6_IS_ADDR_UNSPECIFIED(&ip6->our_tap_ll)) + if (IN6_IS_ADDR_UNSPECIFIED(&e->addr.a6) || + IN6_IS_ADDR_UNSPECIFIED(&c->ip6.our_tap_ll)) return 0;
return ifi; @@ -813,13 +817,13 @@ static unsigned int conf_ip6(unsigned int ifi, struct ip6_ctx *ip6)
/** * conf_ip6_local() - Configure IPv6 addresses and attributes for local mode - * @ip6: IPv6 context (will be written) + * @c: Execution context (will be written) */ -static void conf_ip6_local(struct ip6_ctx *ip6) +static void conf_ip6_local(struct ctx *c) { - ip6->our_tap_ll = ip6->guest_gw = IP6_LL_GUEST_GW; + c->ip6.our_tap_ll = c->ip6.guest_gw = IP6_LL_GUEST_GW;
- ip6->no_copy_addrs = ip6->no_copy_routes = true; + c->ip6.no_copy_addrs = c->ip6.no_copy_routes = true; }
/** @@ -1145,13 +1149,15 @@ static void conf_print(const struct ctx *c) buf4, sizeof(buf4)));
if (!c->no_dhcp) { + struct inany_addr_entry *e = first_v4(c);
You never test for e being NULL in this path. Maybe that's not possible unless c->no_dhcp, but that's pretty non-obvious from here.
uint32_t mask;
- mask = IN4_MASK(c->ip4.prefix_len); + mask = IN4_MASK(e->prefix_len);
info("DHCP:"); info(" assign: %s", - inet_ntop(AF_INET, &c->ip4.addr, buf4, sizeof(buf4))); + inet_ntop(AF_INET, inany_v4(&e->addr), + buf4, sizeof(buf4))); info(" mask: %s", inet_ntop(AF_INET, &mask, buf4, sizeof(buf4))); info(" router: %s", @@ -1189,7 +1195,8 @@ static void conf_print(const struct ctx *c) goto dns6;
info(" assign: %s", - inet_ntop(AF_INET6, &c->ip6.addr, buf6, sizeof(buf6))); + inet_ntop(AF_INET6, &first_v6(c)->addr.a6,
Similarly no test for NULL first_v6() before dereferencing.
+ buf6, sizeof(buf6))); info(" router: %s", inet_ntop(AF_INET6, &c->ip6.guest_gw, buf6, sizeof(buf6))); info(" our link-local: %s", @@ -1811,6 +1818,7 @@ void conf(struct ctx *c, int argc, char **argv) } case 'a': { union inany_addr addr = { 0 }; + struct inany_addr_entry *e; int prefix_len = 0; int af;
@@ -1828,13 +1836,20 @@ void conf(struct ctx *c, int argc, char **argv) die("Can't mix CIDR with -n");
if (af == AF_INET) { - c->ip4.addr = *inany_v4(&addr); - c->ip4.prefix_len = prefix_len ? prefix_len - 96 : - ip4_class_prefix_len(&c->ip4.addr); + e = &c->addrs[c->addr_count]; + e->addr = addr; + e->prefix_len = prefix_len ? prefix_len : + ip4_class_prefix_len(inany_v4(&addr)); + e->flags = INANY_ADDR_CONFIGURED; + c->addr_count++; if (c->mode == MODE_PASTA) c->ip4.no_copy_addrs = true; } else if (af == AF_INET6) { - c->ip6.addr = addr.a6; + e = &c->addrs[c->addr_count]; + e->addr = addr; + e->prefix_len = prefix_len ? prefix_len : 64; + e->flags = INANY_ADDR_CONFIGURED; + c->addr_count++; if (c->mode == MODE_PASTA) c->ip6.no_copy_addrs = true; } else { @@ -1842,14 +1857,21 @@ void conf(struct ctx *c, int argc, char **argv) } break; } - case 'n': + case 'n': { + struct inany_addr_entry *e; + int plen; + if (prefix_from_cidr) die("Can't use both -n and CIDR prefix length"); - c->ip4.prefix_len = conf_ip4_prefix(optarg); - if (c->ip4.prefix_len < 0) + plen = conf_ip4_prefix(optarg); + if (plen < 0) die("Invalid netmask: %s", optarg); + e = first_v4(c); + if (e) + e->prefix_len = plen + 96; prefix_from_opt = true; break; + } case 'M': parse_mac(c->our_tap_mac, optarg); break; @@ -1993,9 +2015,9 @@ void conf(struct ctx *c, int argc, char **argv)
nl_sock_init(c, false); if (!v6_only) - c->ifi4 = conf_ip4(ifi4, &c->ip4); + c->ifi4 = conf_ip4(ifi4, c); if (!v4_only) - c->ifi6 = conf_ip6(ifi6, &c->ip6); + c->ifi6 = conf_ip6(ifi6, c);
if (c->ifi4 && c->mtu < IPV4_MIN_MTU) { warn("MTU %"PRIu16" is too small for IPv4 (minimum %u)", @@ -2018,14 +2040,14 @@ void conf(struct ctx *c, int argc, char **argv) if (!c->ifi4 && !v6_only) { info("IPv4: no external interface as template, use local mode");
- conf_ip4_local(&c->ip4); + conf_ip4_local(c); c->ifi4 = -1; }
if (!c->ifi6 && !v4_only) { info("IPv6: no external interface as template, use local mode");
- conf_ip6_local(&c->ip6); + conf_ip6_local(c); c->ifi6 = -1; }
@@ -2134,7 +2156,8 @@ void conf(struct ctx *c, int argc, char **argv) if (!c->ifi6) { c->no_ndp = 1; c->no_dhcpv6 = 1; - } else if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr)) { + } else if (!first_v6(c) || + IN6_IS_ADDR_UNSPECIFIED(&first_v6(c)->addr.a6)) { c->no_dhcpv6 = 1; }
diff --git a/conf.h b/conf.h index b45ad74..e86fe96 100644 --- a/conf.h +++ b/conf.h @@ -6,6 +6,12 @@ #ifndef CONF_H #define CONF_H
+/* Flags indicating origin and role of an address + * To be used in struct inany_addr_entry + */ +#define CONF_ADDR_CONFIGURED (1 << 0) /* User set via -a */
As Stefano said, I think "USER" or "CMDLINE" would be preferable to "CONFIGURED".
+#define CONF_ADDR_HOST (1 << 1) /* From host interface */ + enum passt_modes conf_mode(int argc, char *argv[]); void conf(struct ctx *c, int argc, char **argv);
diff --git a/dhcp.c b/dhcp.c index c552f01..933a8cf 100644 --- a/dhcp.c +++ b/dhcp.c @@ -352,7 +352,7 @@ int dhcp(const struct ctx *c, struct iov_tail *data) reply.secs = 0; reply.flags = m->flags; reply.ciaddr = m->ciaddr; - reply.yiaddr = c->ip4.addr; + reply.yiaddr = *inany_v4(&first_v4(c)->addr);
Again, maybe it's impossible for first_v4() to be NULL if !no_dhcp, but that's not obvious from here. If it's your intention to rely on that, an ASSERT would make sense.
reply.siaddr = 0; reply.giaddr = m->giaddr; memcpy(&reply.chaddr, m->chaddr, sizeof(reply.chaddr)); @@ -404,7 +404,7 @@ int dhcp(const struct ctx *c, struct iov_tail *data)
info(" from %s", eth_ntop(m->chaddr, macstr, sizeof(macstr)));
- mask.s_addr = IN4_MASK(c->ip4.prefix_len); + mask.s_addr = IN4_MASK(first_v4(c)->prefix_len);
- 96, since the entry has an IPv6 encoded prefix length. Or maybe it would be nicer to replace first_v4() with something that extracts the first v4 address calling inany_v4() for you, and also extracts the matching prefix, doing the re-encoding to IPv4 format.
memcpy(opts[1].s, &mask, sizeof(mask)); memcpy(opts[3].s, &c->ip4.guest_gw, sizeof(c->ip4.guest_gw)); memcpy(opts[54].s, &c->ip4.our_tap_addr, sizeof(c->ip4.our_tap_addr)); @@ -412,7 +412,7 @@ int dhcp(const struct ctx *c, struct iov_tail *data) /* If the gateway is not on the assigned subnet, send an option 121 * (Classless Static Routing) adding a dummy route to it. */ - if ((c->ip4.addr.s_addr & mask.s_addr) + if ((inany_v4(&first_v4(c)->addr)->s_addr & mask.s_addr) != (c->ip4.guest_gw.s_addr & mask.s_addr)) { /* a.b.c.d/32:0.0.0.0, 0:a.b.c.d */ opts[121].slen = 14; @@ -469,7 +469,7 @@ int dhcp(const struct ctx *c, struct iov_tail *data) if (m->flags & FLAG_BROADCAST) dst = in4addr_broadcast; else - dst = c->ip4.addr; + dst = *inany_v4(&first_v4(c)->addr);
tap_udp4_send(c, c->ip4.our_tap_addr, 67, dst, 68, &reply, dlen);
diff --git a/dhcpv6.c b/dhcpv6.c index e4df0db..a1e5f15 100644 --- a/dhcpv6.c +++ b/dhcpv6.c @@ -625,7 +625,7 @@ int dhcpv6(struct ctx *c, struct iov_tail *data, if (mh->type == TYPE_CONFIRM && server_id) return -1;
- if (dhcpv6_ia_notonlink(data, &c->ip6.addr)) { + if (dhcpv6_ia_notonlink(data, &first_v6(c)->addr.a6)) {
Likewise a helper that returns the first v6 address *as* a struct in6_addr might be neater.
dhcpv6_send_ia_notonlink(c, data, &client_id_base, ntohs(client_id->l), mh->xid); @@ -679,7 +679,7 @@ int dhcpv6(struct ctx *c, struct iov_tail *data,
tap_udp6_send(c, src, 547, tap_ip6_daddr(c, src), 546, mh->xid, &resp, n); - c->ip6.addr_seen = c->ip6.addr; + c->ip6.addr_seen = first_v6(c)->addr.a6;
return 1; } @@ -703,5 +703,5 @@ void dhcpv6_init(const struct ctx *c) memcpy(resp_not_on_link.server_id.duid_lladdr, c->our_tap_mac, sizeof(c->our_tap_mac));
- resp.ia_addr.addr = c->ip6.addr; + resp.ia_addr.addr = first_v6(c)->addr.a6; } diff --git a/fwd.c b/fwd.c index 44a0e10..54248a3 100644 --- a/fwd.c +++ b/fwd.c @@ -516,7 +516,8 @@ static bool fwd_guest_accessible4(const struct ctx *c, /* For IPv4, addr_seen is initialised to addr, so is always a valid * address */ - if (IN4_ARE_ADDR_EQUAL(addr, &c->ip4.addr) || + if ((first_v4(c) && + IN4_ARE_ADDR_EQUAL(addr, inany_v4(&first_v4(c)->addr))) || IN4_ARE_ADDR_EQUAL(addr, &c->ip4.addr_seen)) return false;
@@ -537,7 +538,7 @@ static bool fwd_guest_accessible6(const struct ctx *c, if (IN6_IS_ADDR_LOOPBACK(addr)) return false;
- if (IN6_ARE_ADDR_EQUAL(addr, &c->ip6.addr)) + if (first_v6(c) && IN6_ARE_ADDR_EQUAL(addr, &first_v6(c)->addr.a6)) return false;
/* For IPv6, addr_seen starts unspecified, because we don't know what LL @@ -586,10 +587,10 @@ static void nat_outbound(const struct ctx *c, const union inany_addr *addr, *translated = inany_loopback4; else if (inany_equals6(addr, &c->ip6.map_host_loopback)) *translated = inany_loopback6; - else if (inany_equals4(addr, &c->ip4.map_guest_addr)) - *translated = inany_from_v4(c->ip4.addr); - else if (inany_equals6(addr, &c->ip6.map_guest_addr)) - translated->a6 = c->ip6.addr; + else if (first_v4(c) && inany_equals4(addr, &c->ip4.map_guest_addr)) + *translated = first_v4(c)->addr; + else if (first_v6(c) && inany_equals6(addr, &c->ip6.map_guest_addr)) + translated->a6 = first_v6(c)->addr.a6;
Implementing the forward table for the tap pif will make this cleaner, when I get to it.
else *translated = *addr; } @@ -710,10 +711,10 @@ bool nat_inbound(const struct ctx *c, const union inany_addr *addr, inany_equals6(addr, &in6addr_loopback)) { translated->a6 = c->ip6.map_host_loopback; } else if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.map_guest_addr) && - inany_equals4(addr, &c->ip4.addr)) { + first_v4(c) && inany_equals(addr, &first_v4(c)->addr)) { *translated = inany_from_v4(c->ip4.map_guest_addr); } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.map_guest_addr) && - inany_equals6(addr, &c->ip6.addr)) { + first_v6(c) && inany_equals6(addr, &first_v6(c)->addr.a6)) { translated->a6 = c->ip6.map_guest_addr; } else if (fwd_guest_accessible(c, addr)) { *translated = *addr; diff --git a/ip.c b/ip.c index 40dc24e..a86753f 100644 --- a/ip.c +++ b/ip.c @@ -74,17 +74,18 @@ found: * ip4_class_prefix_len() - Get class based prefix length for IPv4 address * @addr: IPv4 address * - * Return: prefix length based on address class, or 32 for other + * Return: prefix length in IPv6 format (96 + IPv4 prefix) based on address + * class, or 128 for other
Here again, I don't think you should change the encoding without changing the function name. As I've said, I think we should consistently use IPv6 encoding *when dealing with an inany*. But functions that deal specifically with IPv4 addresses, should use the IPv4 prefix length encoding.
*/ int ip4_class_prefix_len(const struct in_addr *addr) { in_addr_t a = ntohl(addr->s_addr);
if (IN_CLASSA(a)) - return 32 - IN_CLASSA_NSHIFT; + return 96 + 32 - IN_CLASSA_NSHIFT; if (IN_CLASSB(a)) - return 32 - IN_CLASSB_NSHIFT; + return 96 + 32 - IN_CLASSB_NSHIFT; if (IN_CLASSC(a)) - return 32 - IN_CLASSC_NSHIFT; - return 32; + return 96 + 32 - IN_CLASSC_NSHIFT; + return 128; } diff --git a/ip.h b/ip.h index c829d84..91d6d9a 100644 --- a/ip.h +++ b/ip.h @@ -137,6 +137,11 @@ static const struct in_addr in4addr_broadcast = { 0xffffffff }; #define IPV6_MIN_MTU 1280 #endif
+/* Maximum number of addresses per address family */ +#define IP4_MAX_ADDRS 16 +#define IP6_MAX_ADDRS 16
Separate defines here don't really make sense now you have a single address array.
+#define INANY_MAX_ADDRS (IP4_MAX_ADDRS + IP6_MAX_ADDRS) + int ip4_class_prefix_len(const struct in_addr *addr);
#endif /* IP_H */ diff --git a/ndp.c b/ndp.c index eb9e313..f60c298 100644 --- a/ndp.c +++ b/ndp.c @@ -257,7 +257,7 @@ static void ndp_ra(const struct ctx *c, const struct in6_addr *dst) .valid_lifetime = ~0U, .pref_lifetime = ~0U, }, - .prefix = c->ip6.addr, + .prefix = first_v6(c)->addr.a6, .source_ll = { .header = { .type = OPT_SRC_L2_ADDR, @@ -466,8 +466,8 @@ void ndp_send_init_req(const struct ctx *c) .icmp6_solicited = 0, /* Reserved */ .icmp6_override = 0, /* Reserved */ }, - .target_addr = c->ip6.addr + .target_addr = first_v6(c)->addr.a6 }; debug("Sending initial NDP NS request for guest MAC address"); - ndp_send(c, &c->ip6.addr, &ns, sizeof(ns)); + ndp_send(c, &first_v6(c)->addr.a6, &ns, sizeof(ns)); } diff --git a/passt.h b/passt.h index 79d01dd..ec5517d 100644 --- a/passt.h +++ b/passt.h @@ -64,11 +64,42 @@ enum passt_modes { MODE_VU, };
+/* Flags indicating origin and role of an address + * in struct inany_addr_entry + */ +#define INANY_ADDR_CONFIGURED BIT(0) /* User set via -a */ +#define INANY_ADDR_HOST BIT(1) /* From host interface */
These appear to be redundant with with CONF_ADDR_*. These are specific to how we handle configuration, not general properties of a prefix, so I don't think these ones belong.
+ +/** + * for_each_addr() - Iterate over addresses in unified array + * @c: Pointer to struct ctx + * @e: Pointer variable for current entry (struct inany_addr_entry *)
Wll, @c going first is a strongish convention, but the loop variable going first in a for_each_() macro is an even stronger convention.
+ * @af: Address family filter: AF_INET, AF_INET6, or 0 for all
Iterating over just one address type is going to become less and less common as we unify handling in various places. So it might be simpler to just have a single iterator, and explicitly filter the "wrong" type in the places we need to.
+ * + * Note: @_i is the internal loop counter, uses _next_addr_idx() helper + */ +#define for_each_addr(c, e, af) \ + for (int _i = _next_addr_idx((c), 0, (af)); \ + _i < (c)->addr_count && ((e) = &(c)->addrs[_i], true); \ + _i = _next_addr_idx((c), _i + 1, (af))) + +/* first_v4(), first_v6(), count_v4(), count_v6() defined after struct ctx */ + +/** + * struct inany_addr_entry - Unified IPv4/IPv6 address entry + * @addr: IPv4 (as mapped) or IPv6 address + * @prefix_len: Prefix length in IPv6/IPv4-mapped [0,128]/[96,128] format + * @flags: INANY_ADDR_* flags + */ +struct inany_addr_entry { + union inany_addr addr; + uint8_t prefix_len; + uint8_t flags; +}; + /** * struct ip4_ctx - IPv4 execution context - * @addr: IPv4 address assigned to guest * @addr_seen: Latest IPv4 address seen as source from tap - * @prefixlen: IPv4 prefix length (netmask) * @guest_gw: IPv4 gateway as seen by the guest * @map_host_loopback: Outbound connections to this address are NATted to the * host's 127.0.0.1 @@ -84,10 +115,7 @@ enum passt_modes { * @no_copy_addrs: Don't copy all addresses when configuring namespace */ struct ip4_ctx { - /* PIF_TAP addresses */ - struct in_addr addr; struct in_addr addr_seen; - int prefix_len; struct in_addr guest_gw; struct in_addr map_host_loopback; struct in_addr map_guest_addr; @@ -107,7 +135,6 @@ struct ip4_ctx {
/** * struct ip6_ctx - IPv6 execution context - * @addr: IPv6 address assigned to guest * @addr_seen: Latest IPv6 global/site address seen as source from tap * @addr_ll_seen: Latest IPv6 link-local address seen as source from tap * @guest_gw: IPv6 gateway as seen by the guest @@ -125,8 +152,6 @@ struct ip4_ctx { * @no_copy_addrs: Don't copy all addresses when configuring namespace */ struct ip6_ctx { - /* PIF_TAP addresses */ - struct in6_addr addr; struct in6_addr addr_seen; struct in6_addr addr_ll_seen; struct in6_addr guest_gw; @@ -257,6 +282,10 @@ struct ctx { int ifi6; struct ip6_ctx ip6;
+ /* Unified address array for both IPv4 (mapped) and IPv6 */ + struct inany_addr_entry addrs[INANY_MAX_ADDRS]; + int addr_count; +
THe new fields need to be added to the structure comment
char pasta_ifn[IF_NAMESIZE]; unsigned int pasta_ifi; int pasta_conf_ns; @@ -294,6 +323,84 @@ struct ctx { bool migrate_exit; };
+/** + * first_v4() - Get first IPv4 address entry + * @c: Pointer to struct ctx + * + * Return: pointer to first IPv4 entry, or NULL if none + */ +static inline struct inany_addr_entry *first_v4(const struct ctx *c) +{ + for (int i = 0; i < c->addr_count; i++) + if (inany_v4(&c->addrs[i].addr)) + return (struct inany_addr_entry *)&c->addrs[i]; + return NULL; +} + +/** + * first_v6() - Get first IPv6 address entry + * @c: Pointer to struct ctx + * + * Return: pointer to first IPv6 entry, or NULL if none + */ +static inline struct inany_addr_entry *first_v6(const struct ctx *c) +{ + for (int i = 0; i < c->addr_count; i++) + if (!inany_v4(&c->addrs[i].addr)) + return (struct inany_addr_entry *)&c->addrs[i]; + return NULL; +} + +/** + * count_v4() - Count IPv4 addresses + * @c: Pointer to struct ctx + * + * Return: number of IPv4 addresses + */ +static inline int count_v4(const struct ctx *c) +{ + int count = 0; + + for (int i = 0; i < c->addr_count; i++) + if (inany_v4(&c->addrs[i].addr)) + count++; + return count; +} + +/** + * count_v6() - Count IPv6 addresses + * @c: Pointer to struct ctx + * + * Return: number of IPv6 addresses + */ +static inline int count_v6(const struct ctx *c) +{ + int count = 0; + + for (int i = 0; i < c->addr_count; i++) + if (!inany_v4(&c->addrs[i].addr)) + count++; + return count; +} + +/** + * _next_addr_idx() - Find next address index matching family filter + * @c: Pointer to struct ctx + * @i: Starting index + * @af: Address family filter: AF_INET, AF_INET6, or 0 for all + * + * Return: next matching index, or addr_count if none found + */ +static inline int _next_addr_idx(const struct ctx *c, int i, int af) +{ + while (i < c->addr_count) { + if (!af || ((af == AF_INET) == !!inany_v4(&c->addrs[i].addr))) + return i; + i++; + } + return i; +}
This is a pretty fiddly way of doing the iteration. Simpler to have a plain loop through the array, and skip the entries you don't want.
+ void proto_update_l2_buf(const unsigned char *eth_d);
#endif /* PASST_H */ diff --git a/pasta.c b/pasta.c index c307b8a..08f35f4 100644 --- a/pasta.c +++ b/pasta.c @@ -338,10 +338,12 @@ void pasta_ns_conf(struct ctx *c)
if (c->ifi4) { if (c->ip4.no_copy_addrs) { + struct inany_addr_entry *e = first_v4(c);
NULL check?
+ rc = nl_addr_set(nl_sock_ns, c->pasta_ifi, AF_INET, - &c->ip4.addr, - c->ip4.prefix_len); + inany_v4(&e->addr), + e->prefix_len); } else { rc = nl_addr_dup(nl_sock, c->ifi4, nl_sock_ns, c->pasta_ifi, @@ -387,10 +389,13 @@ void pasta_ns_conf(struct ctx *c) 0, IFF_NOARP);
if (c->ip6.no_copy_addrs) { - if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr)) { + struct inany_addr_entry *e = first_v6(c); + + if (e && !IN6_IS_ADDR_UNSPECIFIED(&e->addr.a6)) {
If it was UNSPECIFIED, then first_v6() would return NULL, no?
rc = nl_addr_set(nl_sock_ns, - c->pasta_ifi, AF_INET6, - &c->ip6.addr, 64); + c->pasta_ifi, + AF_INET6, + &e->addr.a6, 64); } } else { rc = nl_addr_dup(nl_sock, c->ifi6, diff --git a/tap.c b/tap.c index 9d1344b..f3073c6 100644 --- a/tap.c +++ b/tap.c @@ -951,8 +951,9 @@ resume: c->ip6.addr_seen = *saddr; }
- if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr)) - c->ip6.addr = *saddr; + if (first_v6(c) && + IN6_IS_ADDR_UNSPECIFIED(&first_v6(c)->addr.a6))
Again, if the address were unspecified, first_v6() would return NULL, wouldn't it?
+ first_v6(c)->addr.a6 = *saddr; } else if (!IN6_IS_ADDR_UNSPECIFIED(saddr)){ c->ip6.addr_seen = *saddr; } -- 2.52.0
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Fri, Jan 30, 2026 at 04:44:42PM -0500, Jon Maloy wrote:
Extract the IPv4 and IPv6 namespace configuration code from pasta_ns_conf() into separate static functions. This reduces indentation depth and prepares for adding multi-address support.
No functional change.
This seems fine in isolation, but I think it moves in the wrong direction in context. With the unified address array, we want to step through and set all the addresses in a single pass - this kind of works against that.
Signed-off-by: Jon Maloy
--- pasta.c | 182 ++++++++++++++++++++++++++++++-------------------------- 1 file changed, 96 insertions(+), 86 deletions(-) diff --git a/pasta.c b/pasta.c index 08f35f4..de0ba14 100644 --- a/pasta.c +++ b/pasta.c @@ -303,6 +303,98 @@ void pasta_start_ns(struct ctx *c, uid_t uid, gid_t gid, die_perror("Failed to join network namespace"); }
+/** + * pasta_ns_conf_ip4() - Configure IPv4 in namespace + * @c: Execution context + */ +static void pasta_ns_conf_ip4(struct ctx *c) +{ + int rc = 0; + + if (c->ip4.no_copy_addrs) { + struct inany_addr_entry *e = first_v4(c); + + rc = nl_addr_set(nl_sock_ns, c->pasta_ifi, AF_INET, + inany_v4(&e->addr), e->prefix_len - 96); + } else { + rc = nl_addr_dup(nl_sock, c->ifi4, + nl_sock_ns, c->pasta_ifi, AF_INET); + } + + if (rc < 0) { + die("Couldn't set IPv4 address(es) in namespace: %s", + strerror_(-rc)); + } + + if (c->ip4.no_copy_routes) { + rc = nl_route_set_def(nl_sock_ns, c->pasta_ifi, + AF_INET, &c->ip4.guest_gw); + } else { + rc = nl_route_dup(nl_sock, c->ifi4, nl_sock_ns, + c->pasta_ifi, AF_INET); + } + + if (rc < 0) { + die("Couldn't set IPv4 route(s) in guest: %s", + strerror_(-rc)); + } +} + +/** + * pasta_ns_conf_ip6() - Configure IPv6 in namespace + * @c: Execution context + */ +static void pasta_ns_conf_ip6(struct ctx *c) +{ + struct inany_addr_entry *e; + int rc = 0; + + rc = nl_addr_get_ll(nl_sock_ns, c->pasta_ifi, &c->ip6.addr_ll_seen); + if (rc < 0) { + warn("Can't get LL address from namespace: %s", + strerror_(-rc)); + } + + rc = nl_addr_set_ll_nodad(nl_sock_ns, c->pasta_ifi); + if (rc < 0) { + warn("Can't set nodad for LL in namespace: %s", + strerror_(-rc)); + } + + /* We dodged DAD: re-enable neighbour solicitations */ + nl_link_set_flags(nl_sock_ns, c->pasta_ifi, 0, IFF_NOARP); + + if (c->ip6.no_copy_addrs) { + e = first_v6(c); + + if (e && !IN6_IS_ADDR_UNSPECIFIED(&e->addr.a6)) { + rc = nl_addr_set(nl_sock_ns, c->pasta_ifi, + AF_INET6, &e->addr.a6, 64); + } + } else { + rc = nl_addr_dup(nl_sock, c->ifi6, + nl_sock_ns, c->pasta_ifi, AF_INET6); + } + + if (rc < 0) { + die("Couldn't set IPv6 address(es) in namespace: %s", + strerror_(-rc)); + } + + if (c->ip6.no_copy_routes) { + rc = nl_route_set_def(nl_sock_ns, c->pasta_ifi, + AF_INET6, &c->ip6.guest_gw); + } else { + rc = nl_route_dup(nl_sock, c->ifi6, + nl_sock_ns, c->pasta_ifi, AF_INET6); + } + + if (rc < 0) { + die("Couldn't set IPv6 route(s) in guest: %s", + strerror_(-rc)); + } +} + /** * pasta_ns_conf() - Set up loopback and tap interfaces in namespace as needed * @c: Execution context @@ -336,93 +428,11 @@ void pasta_ns_conf(struct ctx *c)
nl_link_set_flags(nl_sock_ns, c->pasta_ifi, flags, flags);
- if (c->ifi4) { - if (c->ip4.no_copy_addrs) { - struct inany_addr_entry *e = first_v4(c); - - rc = nl_addr_set(nl_sock_ns, c->pasta_ifi, - AF_INET, - inany_v4(&e->addr), - e->prefix_len); - } else { - rc = nl_addr_dup(nl_sock, c->ifi4, - nl_sock_ns, c->pasta_ifi, - AF_INET); - } - - if (rc < 0) { - die("Couldn't set IPv4 address(es) in namespace: %s", - strerror_(-rc)); - } - - if (c->ip4.no_copy_routes) { - rc = nl_route_set_def(nl_sock_ns, c->pasta_ifi, - AF_INET, - &c->ip4.guest_gw); - } else { - rc = nl_route_dup(nl_sock, c->ifi4, nl_sock_ns, - c->pasta_ifi, AF_INET); - } - - if (rc < 0) { - die("Couldn't set IPv4 route(s) in guest: %s", - strerror_(-rc)); - } - } + if (c->ifi4) + pasta_ns_conf_ip4(c);
- if (c->ifi6) { - rc = nl_addr_get_ll(nl_sock_ns, c->pasta_ifi, - &c->ip6.addr_ll_seen); - if (rc < 0) { - warn("Can't get LL address from namespace: %s", - strerror_(-rc)); - } - - rc = nl_addr_set_ll_nodad(nl_sock_ns, c->pasta_ifi); - if (rc < 0) { - warn("Can't set nodad for LL in namespace: %s", - strerror_(-rc)); - } - - /* We dodged DAD: re-enable neighbour solicitations */ - nl_link_set_flags(nl_sock_ns, c->pasta_ifi, - 0, IFF_NOARP); - - if (c->ip6.no_copy_addrs) { - struct inany_addr_entry *e = first_v6(c); - - if (e && !IN6_IS_ADDR_UNSPECIFIED(&e->addr.a6)) { - rc = nl_addr_set(nl_sock_ns, - c->pasta_ifi, - AF_INET6, - &e->addr.a6, 64); - } - } else { - rc = nl_addr_dup(nl_sock, c->ifi6, - nl_sock_ns, c->pasta_ifi, - AF_INET6); - } - - if (rc < 0) { - die("Couldn't set IPv6 address(es) in namespace: %s", - strerror_(-rc)); - } - - if (c->ip6.no_copy_routes) { - rc = nl_route_set_def(nl_sock_ns, c->pasta_ifi, - AF_INET6, - &c->ip6.guest_gw); - } else { - rc = nl_route_dup(nl_sock, c->ifi6, - nl_sock_ns, c->pasta_ifi, - AF_INET6); - } - - if (rc < 0) { - die("Couldn't set IPv6 route(s) in guest: %s", - strerror_(-rc)); - } - } + if (c->ifi6) + pasta_ns_conf_ip6(c); }
proto_update_l2_buf(c->guest_mac); -- 2.52.0
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Fri, Jan 30, 2026 at 04:44:43PM -0500, Jon Maloy wrote:
We enable configuration of multiple IPv4 and IPv6 addresses by allowing repeated use of the -a/--address option.
- We update option parsing to append addresses to the unified addrs[] array, with limit checks for IP4_MAX_ADDRS and IP6_MAX_ADDRS.
I don't see any point to separate v4 and v6 limits, now there's a unified array.
- Each address specified via -a, but with no prefix length indicated, gets a class-based default prefix length.
That's not new, is it? I think we want to be careful with this. We need to maintain compatibility with previously working options, but I think we want to strongly encourage prefix lengths to be explicitly specified. Address classes are very anachronistic now, so I don't think we want to add any more uses of them than we have to for compatibility.
- If no -a option is given, addresses/prefix lengths are inherited from the template interface. - If a prefix length is to be added, it has to be done in CIDR format, except for the very first address.
I don't really follow what that means.
- We configure all indicated addresses in the namespace interface using the for_each_addr() macro.
Signed-off-by: Jon Maloy
--- v2: - Adapted to previous code changes v3: - Adapted to single-array strategy - Changes according to feedback from S. Brivio and G Gibson. --- conf.c | 23 ++++++++++++++++------- pasta.c | 21 ++++++++++++++------- 2 files changed, 30 insertions(+), 14 deletions(-)
diff --git a/conf.c b/conf.c index bb6bcf8..d73a3dd 100644 --- a/conf.c +++ b/conf.c @@ -803,13 +803,13 @@ static unsigned int conf_ip6(unsigned int ifi, struct ctx *c) }
e = first_v6(c); - c->ip6.addr_seen = e->addr.a6; + if (e) + c->ip6.addr_seen = e->addr.a6;
This seems like it belongs in an earlier patch.
if (IN6_IS_ADDR_LINKLOCAL(&c->ip6.guest_gw)) c->ip6.our_tap_ll = c->ip6.guest_gw;
- if (IN6_IS_ADDR_UNSPECIFIED(&e->addr.a6) || - IN6_IS_ADDR_UNSPECIFIED(&c->ip6.our_tap_ll))
Missed this in an earlier patch, but !!e and an unspecified address shouldn't be possible, no?
+ if (!count_v6(c) || IN6_IS_ADDR_UNSPECIFIED(&c->ip6.our_tap_ll)) return 0;
return ifi; @@ -901,9 +901,11 @@ static void usage(const char *name, FILE *f, int status) " default: 65520: maximum 802.3 MTU minus 802.3 header\n" " length, rounded to 32 bits (IPv4 words)\n" " -a, --address ADDR Assign IPv4 or IPv6 address ADDR[/PREFIXLEN]\n" - " can be specified zero to two times (for IPv4 and IPv6)\n" + " can be specified multiple times (limit: %d IPv4, %d IPv6)\n" " default: use addresses from interface with default route\n" - " -n, --netmask MASK Assign IPv4 MASK, dot-decimal or bits\n" + " -n, --netmask MASK Assign IPv4 MASK, dot-decimal or bits\n", + IP4_MAX_ADDRS, IP6_MAX_ADDRS); + FPRINTF(f, " default: netmask from matching address on the host\n" " -M, --mac-addr ADDR Use source MAC address ADDR\n" " default: 9a:55:9a:55:9a:55 (locally administered)\n" @@ -1836,6 +1838,9 @@ void conf(struct ctx *c, int argc, char **argv) die("Can't mix CIDR with -n");
if (af == AF_INET) { + if (count_v4(c) >= IP4_MAX_ADDRS) + die("Too many IPv4 addresses"); + e = &c->addrs[c->addr_count]; e->addr = addr; e->prefix_len = prefix_len ? prefix_len : @@ -1845,6 +1850,9 @@ void conf(struct ctx *c, int argc, char **argv) if (c->mode == MODE_PASTA) c->ip4.no_copy_addrs = true; } else if (af == AF_INET6) { + if (count_v6(c) >= IP6_MAX_ADDRS) + die("Too many IPv6 addresses"); + e = &c->addrs[c->addr_count]; e->addr = addr; e->prefix_len = prefix_len ? prefix_len : 64; @@ -1861,6 +1869,8 @@ void conf(struct ctx *c, int argc, char **argv) struct inany_addr_entry *e; int plen;
+ if (count_v4(c) > 1) + die("-n can only be used with first address"); if (prefix_from_cidr) die("Can't use both -n and CIDR prefix length"); plen = conf_ip4_prefix(optarg); @@ -2156,8 +2166,7 @@ void conf(struct ctx *c, int argc, char **argv) if (!c->ifi6) { c->no_ndp = 1; c->no_dhcpv6 = 1; - } else if (!first_v6(c) || - IN6_IS_ADDR_UNSPECIFIED(&first_v6(c)->addr.a6)) { + } else if (!count_v6(c)) { c->no_dhcpv6 = 1; }
diff --git a/pasta.c b/pasta.c index de0ba14..8cb5873 100644 --- a/pasta.c +++ b/pasta.c @@ -312,10 +312,14 @@ static void pasta_ns_conf_ip4(struct ctx *c) int rc = 0;
if (c->ip4.no_copy_addrs) { - struct inany_addr_entry *e = first_v4(c); + const struct inany_addr_entry *e;
- rc = nl_addr_set(nl_sock_ns, c->pasta_ifi, AF_INET, - inany_v4(&e->addr), e->prefix_len - 96); + for_each_addr(c, e, AF_INET) { + rc = nl_addr_set(nl_sock_ns, c->pasta_ifi, AF_INET, + inany_v4(&e->addr), e->prefix_len - 96); + if (rc < 0) + break; + }
As noted on the previous patch, I think a single pass through the array makes sense, only changing the inner part of the loop for v4 vs. v6 (we could potentially update nl_addr_set() to take an inany, which might make it cleaner).
} else { rc = nl_addr_dup(nl_sock, c->ifi4, nl_sock_ns, c->pasta_ifi, AF_INET); @@ -346,7 +350,6 @@ static void pasta_ns_conf_ip4(struct ctx *c) */ static void pasta_ns_conf_ip6(struct ctx *c) { - struct inany_addr_entry *e; int rc = 0;
rc = nl_addr_get_ll(nl_sock_ns, c->pasta_ifi, &c->ip6.addr_ll_seen); @@ -365,11 +368,15 @@ static void pasta_ns_conf_ip6(struct ctx *c) nl_link_set_flags(nl_sock_ns, c->pasta_ifi, 0, IFF_NOARP);
if (c->ip6.no_copy_addrs) { - e = first_v6(c); + const struct inany_addr_entry *e;
- if (e && !IN6_IS_ADDR_UNSPECIFIED(&e->addr.a6)) { + for_each_addr(c, e, AF_INET6) { + if (IN6_IS_ADDR_UNSPECIFIED(&e->addr.a6)) + continue; rc = nl_addr_set(nl_sock_ns, c->pasta_ifi, - AF_INET6, &e->addr.a6, 64); + AF_INET6, &e->addr.a6, e->prefix_len); + if (rc < 0) + break; } } else { rc = nl_addr_dup(nl_sock, c->ifi6, -- 2.52.0
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Fri, Jan 30, 2026 at 04:44:41PM -0500, Jon Maloy wrote:
As a preparation for handling multiple addresses, we update ignore_arp() to check against all addresses in the unified addrs[] array using the for_each_addr() macro.
Signed-off-by: Jon Maloy
Reviewed-by: David Gibson
--- v3: Adapted to single-array changes earlier in this series --- arp.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/arp.c b/arp.c index f16beac..c493f9e 100644 --- a/arp.c +++ b/arp.c @@ -41,6 +41,8 @@ static bool ignore_arp(const struct ctx *c, const struct arphdr *ah, const struct arpmsg *am) { + const struct inany_addr_entry *e; + if (ah->ar_hrd != htons(ARPHRD_ETHER) || ah->ar_pro != htons(ETH_P_IP) || ah->ar_hln != ETH_ALEN || @@ -53,10 +55,10 @@ static bool ignore_arp(const struct ctx *c, !memcmp(am->sip, am->tip, sizeof(am->sip))) return true;
- /* Don't resolve the guest's assigned address, either. */ - if (first_v4(c) && - !memcmp(am->tip, inany_v4(&first_v4(c)->addr), sizeof(am->tip))) - return true; + /* Don't resolve any of the guest's addresses */ + for_each_addr(c, e, AF_INET) + if (!memcmp(am->tip, inany_v4(&e->addr), sizeof(am->tip))) + return true;
return false; } -- 2.52.0
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
On Fri, Jan 30, 2026 at 04:44:44PM -0500, Jon Maloy wrote:
Some migration address structures and functions have a _v1 suffix. This is confusing, since they are currently handling version 2 of the migration protocol. We are now going to introduce a new version 3 of the protocol, so we choose to give these functions the correct suffix _v2 instead. This is in correspondence with current reality, and will help make a clearer distinction between the old and the new versions of those functions.
I think the historical reason for this is that those specific
components didn't change format from v1 to v2. But, given that v1
existed only very briefly and is not now supported, that's not very
relevant.
So,
Reviewed-by: David Gibson
Signed-off-by: Jon Maloy
--- migrate.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/migrate.c b/migrate.c index 48d63a0..7398d26 100644 --- a/migrate.c +++ b/migrate.c @@ -29,13 +29,13 @@ #define MIGRATE_MAGIC 0xB1BB1D1B0BB1D1B0
/** - * struct migrate_seen_addrs_v1 - Migratable guest addresses for v1 state stream + * struct migrate_seen_addrs_v2 - Migratable guest addresses for v2 protocol * @addr6: Observed guest IPv6 address * @addr6_ll: Observed guest IPv6 link-local address * @addr4: Observed guest IPv4 address * @mac: Observed guest MAC address */ -struct migrate_seen_addrs_v1 { +struct migrate_seen_addrs_v2 { struct in6_addr addr6; struct in6_addr addr6_ll; struct in_addr addr4; @@ -43,7 +43,7 @@ struct migrate_seen_addrs_v1 { } __attribute__((packed));
/** - * seen_addrs_source_v1() - Copy and send guest observed addresses from source + * seen_addrs_source_v2() - Copy and send guest observed addresses from source * @c: Execution context * @stage: Migration stage, unused * @fd: File descriptor for state transfer @@ -51,10 +51,10 @@ struct migrate_seen_addrs_v1 { * Return: 0 on success, positive error code on failure */ /* cppcheck-suppress [constParameterCallback, unmatchedSuppression] */ -static int seen_addrs_source_v1(struct ctx *c, +static int seen_addrs_source_v2(struct ctx *c, const struct migrate_stage *stage, int fd) { - struct migrate_seen_addrs_v1 addrs = { + struct migrate_seen_addrs_v2 addrs = { .addr6 = c->ip6.addr_seen, .addr6_ll = c->ip6.addr_ll_seen, .addr4 = c->ip4.addr_seen, @@ -71,17 +71,17 @@ static int seen_addrs_source_v1(struct ctx *c, }
/** - * seen_addrs_target_v1() - Receive and use guest observed addresses on target + * seen_addrs_target_v2() - Receive and use guest observed addresses on target * @c: Execution context * @stage: Migration stage, unused * @fd: File descriptor for state transfer * * Return: 0 on success, positive error code on failure */ -static int seen_addrs_target_v1(struct ctx *c, +static int seen_addrs_target_v2(struct ctx *c, const struct migrate_stage *stage, int fd) { - struct migrate_seen_addrs_v1 addrs; + struct migrate_seen_addrs_v2 addrs;
(void)stage;
@@ -100,8 +100,8 @@ static int seen_addrs_target_v1(struct ctx *c, static const struct migrate_stage stages_v2[] = { { .name = "observed addresses", - .source = seen_addrs_source_v1, - .target = seen_addrs_target_v1, + .source = seen_addrs_source_v2, + .target = seen_addrs_target_v2, }, { .name = "prepare flows", -- 2.52.0
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson
participants (2)
-
David Gibson
-
Jon Maloy