[PATCH 0/7] Improved selection of external interface and IP configuration
The current semantics for selecting an external interface are quite confusing - depending on details it can pick either the interface associated with the first default route, or the lowest numbered interface with a default route, which might not be the same.The logic for checking the interface in the tests isn't quite identical which can lead to test failures when there are multiple external routes. This series fixes that bug and makes a number of follow on clean ups to the detection / configuration of IP parameters from the host. David Gibson (7): Allow different external interfaces for IPv4 and IPv6 connectivity Separately locate external interfaces for IPv4 and IPv6 Initialize host side MAC when in IPv6 only mode Move passt mac_guest init to be more symmetric with pasta Clarify semantics of c->v4 and c->v6 variables Separate IPv4 and IPv6 configuration Make substructures for IPv4 and IPv6 specific context information arp.c | 2 +- conf.c | 326 ++++++++++++++++++++++-------------------- dhcp.c | 22 +-- dhcpv6.c | 18 +-- ndp.c | 16 +-- netlink.c | 79 +--------- netlink.h | 2 +- passt.c | 6 +- passt.h | 78 +++++----- pasta.c | 14 +- tap.c | 32 +++-- tcp.c | 56 ++++---- test/dhcp/passt | 3 +- test/dhcp/pasta | 3 +- test/ndp/passt | 4 +- test/two_guests/basic | 3 +- udp.c | 70 ++++----- util.c | 6 +- util.h | 6 - 19 files changed, 357 insertions(+), 389 deletions(-) -- 2.37.1
It's quite plausible for a host to have both IPv4 and IPv6 connectivity,
but only via different interfaces. For example, this will happen in the
case that IPv6 connectivity is via a tunnel (e.g. 6in4 or 6rd). It would
also happen in the case that IPv4 access is via a tunnel on an otherwise
IPv6 only local network, which is a setup that might become more common in
the post IPv4 address exhaustion world.
In turns out there's no real need for passt/pasta to get its IPv4 and IPv6
connectivity via the same interface, so we can handle this situation fairly
easily. Change the core to allow eparate external interfaces for IPv4 and
IPv6. We don't actually set these separately for now.
Signed-off-by: David Gibson
Now that the back end allows passt/pasta to use different external
interfaces for IPv4 and IPv6, use that to do the right thing in the case
that the host has IPv4 and IPv6 connectivity via different interfaces.
If the user hasn't explicitly chosen an interface, separately search for
a suitable external interface for each protocol.
As a bonus, this substantially simplifies the external interface probe. It
also eliminates a subtle confusing case where in some circumstances we
would pick the first interface in interface index order, and sometimes in
order of routes returned from netlink. On some network configurations that
could cause tests to fail, because the logic in the tests was subtly
different (it always used route order).
Signed-off-by: David Gibson
On Fri, 22 Jul 2022 15:31:13 +1000
David Gibson
Now that the back end allows passt/pasta to use different external interfaces for IPv4 and IPv6, use that to do the right thing in the case that the host has IPv4 and IPv6 connectivity via different interfaces. If the user hasn't explicitly chosen an interface, separately search for a suitable external interface for each protocol.
As a bonus, this substantially simplifies the external interface probe. It also eliminates a subtle confusing case where in some circumstances we would pick the first interface in interface index order, and sometimes in order of routes returned from netlink. On some network configurations that could cause tests to fail, because the logic in the tests was subtly different (it always used route order).
Signed-off-by: David Gibson
--- conf.c | 19 +++++++++-- netlink.c | 79 ++++--------------------------------------- netlink.h | 2 +- test/dhcp/passt | 3 +- test/dhcp/pasta | 3 +- test/ndp/passt | 4 +-- test/two_guests/basic | 3 +- 7 files changed, 33 insertions(+), 80 deletions(-) [...]
diff --git a/test/dhcp/passt b/test/dhcp/passt index f45227a..11e0eb3 100644 --- a/test/dhcp/passt +++ b/test/dhcp/passt @@ -17,6 +17,7 @@ htools ip jq sed tr head test Interface name gout IFNAME ip -j link show | jq -rM '.[] | select(.link_type == "ether").ifname' hout HOST_IFNAME ip -j -4 route show|jq -rM '[.[] | select(.dst == "default").dev] | .[0]' +hout HOST_IFNAME6 ip -j -6 route show|jq -rM '[.[] | select(.dst == "default").dev] | .[0]' check [ -n "__IFNAME__" ]
test DHCP: address @@ -49,7 +50,7 @@ check [ "__SEARCH__" = "__HOST_SEARCH__" ] test DHCPv6: address guest /sbin/dhclient -6 __IFNAME__ gout ADDR6 ip -j -6 addr show|jq -rM '.[] | select(.ifname == "__IFNAME__").addr_info[] | select(.prefixlen == 128).local' -hout HOST_ADDR6 ip -j -6 addr show|jq -rM '.[] | select(.ifname == "__HOST_IFNAME__").addr_info[] | select(.scope == "global").local' +hout HOST_ADDR6 ip -j -6 addr show|jq -rM '.[] | select(.ifname == "__HOST_IFNAME6__").addr_info[] | select(.scope == "global").local' check [ "__ADDR6__" = "__HOST_ADDR6__" ]
test DHCPv6: route
This,
[...]
diff --git a/test/two_guests/basic b/test/two_guests/basic index f7c016d..e226178 100644 --- a/test/two_guests/basic +++ b/test/two_guests/basic @@ -19,6 +19,7 @@ test Interface names g1out IFNAME1 ip -j link show | jq -rM '.[] | select(.link_type == "ether").ifname' g2out IFNAME2 ip -j link show | jq -rM '.[] | select(.link_type == "ether").ifname' hout HOST_IFNAME ip -j -4 route show|jq -rM '[.[] | select(.dst == "default").dev] | .[0]' +hout HOST_IFNAME6 ip -j -6 route show|jq -rM '[.[] | select(.dst == "default").dev] | .[0]' check [ -n "__IFNAME1__" ] check [ -n "__IFNAME2__" ]
@@ -40,7 +41,7 @@ guest1 /sbin/dhclient -6 __IFNAME1__ guest2 /sbin/dhclient -6 __IFNAME2__ g1out ADDR1_6 ip -j -6 addr show|jq -rM '.[] | select(.ifname == "__IFNAME1__").addr_info[] | select(.prefixlen == 128).local' g2out ADDR2_6 ip -j -6 addr show|jq -rM '.[] | select(.ifname == "__IFNAME2__").addr_info[] | select(.prefixlen == 128).local' -hout HOST_ADDR6 ip -j -6 addr show|jq -rM '.[] | select(.ifname == "__HOST_IFNAME__").addr_info[] | select(.scope == "global").local' +hout HOST_ADDR6 ip -j -6 addr show|jq -rM '.[] | select(.ifname == "__HOST_IFNAME6__").addr_info[] | select(.scope == "global").local' check [ "__ADDR1_6__" = "__HOST_ADDR6__" ] check [ "__ADDR2_6__" = "__HOST_ADDR6__" ]
and this are based on patch 17/18 from your previous series ("tests: Correct determination of host interface name in tests"). I went ahead and applied that patch (which I skipped previously) before this one, anyway it makes sense now. -- Stefano
When sending packets to the guest we need a source MAC address, which we
currently take from the host side interface we're using (though it's
basically arbitrary). However if not given on the command line this MAC
is initialized in an IPv4 specific path, and will end up as
00:00:00:00:00:00 when running "passt 6". The MAC address is also used
for IPv6 packets, though.
Interestingly, we largely seem to get away with using an all-zero MAC, but
it's probably not a good idea. Make the IPv6 path pick the MAC address
from its interface if the IPv4 path hasn't already done so.
While we're there, use the existing MAC_IS_ZERO macro to make the code a
little clearer.
Signed-off-by: David Gibson
In pasta mode, the guest's MAC address is set up in pasta_ns_cobf() called
from tap_sock_tun_init(). If we have a guest MAC configured with
--ns-mac-addr, this will set the given MAC on the kernel tuntap device, or
if we haven't configured one it will update our record of the guest MAC to
the kernel assigned one from the device.
For passt, we don't initially know the guest's MAC until we receive packets
from it, so we have to initially use a broadcast address. This is - oddly
- set up in an entirely different place, in conf_ip() conditional on the
mode.
Move it to the logically matching place for passt - tap_sock_unix_init().
Signed-off-by: David Gibson
The v4 and v6 fields of the context structure can be confusing, because
they change meaning part way through the code: Before conf_ip(), they are
booleans which indicate whether the -4 or -6 options have been given.
After conf_ip() they are DISABLED|ENABLED|PROBE enums which indicate
whether the IP version is available (which means both that it was allowed
on the command line and we were able to configure it). The PROBE variant
of the enum is only used locally within conf_ip() and since recent changes
there it no longer has a real purpose different from ENABLED.
Simplify this all by making the context fields always just a boolean
indicating the availability of the IP version. They both default to 1, but
can be set to 0 by either command line options or configuration failures.
We use some local variables in conf() for tracking the state of the command
line options on their own.
Signed-off-by: David Gibson
On Fri, 22 Jul 2022 15:31:16 +1000
David Gibson
The v4 and v6 fields of the context structure can be confusing, because they change meaning part way through the code: Before conf_ip(), they are booleans which indicate whether the -4 or -6 options have been given. After conf_ip() they are DISABLED|ENABLED|PROBE enums which indicate whether the IP version is available (which means both that it was allowed on the command line and we were able to configure it). The PROBE variant of the enum is only used locally within conf_ip() and since recent changes there it no longer has a real purpose different from ENABLED.
Simplify this all by making the context fields always just a boolean indicating the availability of the IP version. They both default to 1, but can be set to 0 by either command line options or configuration failures. We use some local variables in conf() for tracking the state of the command line options on their own.
Signed-off-by: David Gibson
--- conf.c | 59 ++++++++++++++++++++-------------------------------------- util.h | 6 ------ 2 files changed, 20 insertions(+), 45 deletions(-) diff --git a/conf.c b/conf.c index f5b761f..63f0f3d 100644 --- a/conf.c +++ b/conf.c @@ -26,6 +26,7 @@ #include
#include #include +#include #include #include #include @@ -615,40 +616,25 @@ static int conf_ns_opt(struct ctx *c, */ static void conf_ip(struct ctx *c) { - int v4, v6; - if (c->v4) { - c->v4 = IP_VERSION_ENABLED; - v4 = IP_VERSION_PROBE; - v6 = c->v6 = IP_VERSION_DISABLED; - } else if (c->v6) { - c->v6 = IP_VERSION_ENABLED; - v6 = IP_VERSION_PROBE; - v4 = c->v4 = IP_VERSION_DISABLED; - } else { - c->v4 = c->v6 = IP_VERSION_ENABLED; - v4 = v6 = IP_VERSION_PROBE; - } - - if (v4 != IP_VERSION_DISABLED) { if (!c->ifi4) c->ifi4 = nl_get_ext_if(AF_INET); if (!c->ifi4) { warn("No external routable interface for IPv4"); - v4 = IP_VERSION_DISABLED; + c->v4 = 0; } } - if (v6 != IP_VERSION_DISABLED) { + if (c->v6) { if (!c->ifi6) c->ifi6 = nl_get_ext_if(AF_INET6); if (!c->ifi6) { warn("No external routable interface for IPv6"); - v6 = IP_VERSION_DISABLED; + c->v6 = 0; } }
- if (v4 != IP_VERSION_DISABLED) { + if (c->v4) { if (!c->gw4) nl_route(0, c->ifi4, AF_INET, &c->gw4);
@@ -676,7 +662,7 @@ static void conf_ip(struct ctx *c) nl_link(0, c->ifi4, c->mac, 0, 0); }
- if (v6 != IP_VERSION_DISABLED) { + if (c->v6) { int prefix_len = 0;
if (IN6_IS_ADDR_UNSPECIFIED(&c->gw6)) @@ -694,25 +680,18 @@ static void conf_ip(struct ctx *c) }
if (!c->gw4 || !c->addr4 || MAC_IS_ZERO(c->mac)) - v4 = IP_VERSION_DISABLED; - else - v4 = IP_VERSION_ENABLED; + c->v4 = 0;
if (IN6_IS_ADDR_UNSPECIFIED(&c->gw6) || IN6_IS_ADDR_UNSPECIFIED(&c->addr6) || IN6_IS_ADDR_UNSPECIFIED(&c->addr6_ll) || MAC_IS_ZERO(c->mac)) - v6 = IP_VERSION_DISABLED; - else - v6 = IP_VERSION_ENABLED; + c->v6 = 0;
- if ((v4 == IP_VERSION_DISABLED) && (v6 == IP_VERSION_DISABLED)) { + if (!c->v4 && !c->v6) { err("External interface not usable"); exit(EXIT_FAILURE); } - - c->v4 = v4; - c->v6 = v6; }
/** @@ -1054,8 +1033,8 @@ void conf(struct ctx *c, int argc, char **argv) {"no-ndp", no_argument, &c->no_ndp, 1 }, {"no-ra", no_argument, &c->no_ra, 1 }, {"no-map-gw", no_argument, &c->no_map_gw, 1 }, - {"ipv4-only", no_argument, &c->v4, '4' }, - {"ipv6-only", no_argument, &c->v6, '6' }, + {"ipv4-only", no_argument, NULL, '4' }, + {"ipv6-only", no_argument, NULL, '6' }, {"tcp-ports", required_argument, NULL, 't' }, {"udp-ports", required_argument, NULL, 'u' }, {"tcp-ns", required_argument, NULL, 'T' }, @@ -1083,6 +1062,7 @@ void conf(struct ctx *c, int argc, char **argv) struct in6_addr *dns6 = c->dns6; int name, ret, mask, b, i; uint32_t *dns4 = c->dns4; + bool v4_only = false, v6_only = false;
Moved before declaration of *dnss on merge, for coding style consistency, to keep declarations from longest to shortest (what they call reverse Christmas tree in the kernel networking area, which is pretty much the style I followed here). Rationale similar to: https://hisham.hm/2018/06/16/when-listing-repeated-things-make-pyramids/ -- Stefano
After recent changes, conf_ip() now has essentially entirely disjoint paths
for IPv4 and IPv6 configuration. So, it's cleaner to split them out into
different functions conf_ip4() and conf_ip6().
Splitting these out also lets us make the interface a bit nicer, having
them return success or failure directly, rather than manipulating c->v4
and c->v6 to indicate success/failure of the two versions.
Since these functions may also initialize the interface index for each
protocol, it turns out we can then drop c->v4 and c->v6 entirely, replacing
tests on those with tests on whether c->ifi4 or c->ifi6 is non-zero (since
a 0 interface index is never valid).
Signed-off-by: David Gibson
On merge:
On Fri, 22 Jul 2022 15:31:17 +1000
David Gibson
[...]
diff --git a/conf.c b/conf.c index 63f0f3d..2a4ef23 100644 --- a/conf.c +++ b/conf.c @@ -411,8 +411,8 @@ static void get_dns(struct ctx *c) int line_len; char *line, *p, *end;
- dns4_set = !c->v4 || !!*dns4; - dns6_set = !c->v6 || !IN6_IS_ADDR_UNSPECIFIED(dns6); + dns4_set = !c->ifi4 || !!*dns4; + dns6_set = !c->ifi6 || !IN6_IS_ADDR_UNSPECIFIED(dns6);
I dropped the extra whitespace here (the original purpose was to align things), and...
dnss_set = !!*s->n || c->no_dns_search; dns_set = (dns4_set && dns6_set) || c->no_dns;
@@ -611,87 +611,93 @@ static int conf_ns_opt(struct ctx *c, }
/** - * conf_ip() - Verify or detect IPv4/IPv6 support, get relevant addresses + * conf_ip4() - Verify or detect IPv4 support, get relevant addresses * @c: Execution context + * @ifi: Host interface to attempt (0 to determine one) + * + * Return: Interface index for IPv4, or 0 on failure. */ -static void conf_ip(struct ctx *c) +static unsigned int conf_ip4(struct ctx *c, unsigned int ifi) { - if (c->v4) { - if (!c->ifi4) - c->ifi4 = nl_get_ext_if(AF_INET); - if (!c->ifi4) { - warn("No external routable interface for IPv4"); - c->v4 = 0; - } - } + if (!ifi) + ifi = nl_get_ext_if(AF_INET);
- if (c->v6) { - if (!c->ifi6) - c->ifi6 = nl_get_ext_if(AF_INET6); - if (!c->ifi6) { - warn("No external routable interface for IPv6"); - c->v6 = 0; - } + if (!ifi) { + warn("No external routable interface for IPv4"); + return 0; }
- if (c->v4) { - if (!c->gw4) - nl_route(0, c->ifi4, AF_INET, &c->gw4); + if (!c->gw4) + nl_route(0, ifi, AF_INET, &c->gw4);
- if (!c->addr4) { - int mask_len = 0; + if (!c->addr4) { + int mask_len = 0;
- nl_addr(0, c->ifi4, AF_INET, &c->addr4, &mask_len, NULL); - c->mask4 = htonl(0xffffffff << (32 - mask_len)); - } + nl_addr(0, ifi, AF_INET, &c->addr4, &mask_len, NULL); + c->mask4 = htonl(0xffffffff << (32 - mask_len)); + }
- if (!c->mask4) { - if (IN_CLASSA(ntohl(c->addr4))) - c->mask4 = htonl(IN_CLASSA_NET); - else if (IN_CLASSB(ntohl(c->addr4))) - c->mask4 = htonl(IN_CLASSB_NET); - else if (IN_CLASSC(ntohl(c->addr4))) - c->mask4 = htonl(IN_CLASSC_NET); - else - c->mask4 = 0xffffffff; - } + if (!c->mask4) { + if (IN_CLASSA(ntohl(c->addr4))) + c->mask4 = htonl(IN_CLASSA_NET); + else if (IN_CLASSB(ntohl(c->addr4))) + c->mask4 = htonl(IN_CLASSB_NET); + else if (IN_CLASSC(ntohl(c->addr4))) + c->mask4 = htonl(IN_CLASSC_NET); + else + c->mask4 = 0xffffffff; + }
- memcpy(&c->addr4_seen, &c->addr4, sizeof(c->addr4_seen)); + memcpy(&c->addr4_seen, &c->addr4, sizeof(c->addr4_seen));
- if (MAC_IS_ZERO(c->mac)) - nl_link(0, c->ifi4, c->mac, 0, 0); - } + if (MAC_IS_ZERO(c->mac)) + nl_link(0, ifi, c->mac, 0, 0);
- if (c->v6) { - int prefix_len = 0; + if (!c->gw4 || !c->addr4 || MAC_IS_ZERO(c->mac)) + return 0;
- if (IN6_IS_ADDR_UNSPECIFIED(&c->gw6)) - nl_route(0, c->ifi6, AF_INET6, &c->gw6); + return ifi; +}
- nl_addr(0, c->ifi6, AF_INET6, - IN6_IS_ADDR_UNSPECIFIED(&c->addr6) ? &c->addr6 : NULL, - &prefix_len, &c->addr6_ll); +/** + * conf_ip6() - Verify or detect IPv6 support, get relevant addresses + * @c: Execution context + * @ifi: Host interface to attempt (0 to determine one) + * + * Return: Interface index for IPv6, or 0 on failure. + */ +static unsigned int conf_ip6(struct ctx *c, unsigned int ifi) +{ + int prefix_len = 0;
- memcpy(&c->addr6_seen, &c->addr6, sizeof(c->addr6)); - memcpy(&c->addr6_ll_seen, &c->addr6_ll, sizeof(c->addr6_ll)); + if (!ifi) + ifi = nl_get_ext_if(AF_INET6);
- if (MAC_IS_ZERO(c->mac)) - nl_link(0, c->ifi6, c->mac, 0, 0); + if (!ifi) { + warn("No external routable interface for IPv6"); + return 0; }
- if (!c->gw4 || !c->addr4 || MAC_IS_ZERO(c->mac)) - c->v4 = 0; + if (IN6_IS_ADDR_UNSPECIFIED(&c->gw6)) + nl_route(0, ifi, AF_INET6, &c->gw6); + + nl_addr(0, ifi, AF_INET6, + IN6_IS_ADDR_UNSPECIFIED(&c->addr6) ? &c->addr6 : NULL, + &prefix_len, &c->addr6_ll); + + memcpy(&c->addr6_seen, &c->addr6, sizeof(c->addr6)); + memcpy(&c->addr6_ll_seen, &c->addr6_ll, sizeof(c->addr6_ll)); + + if (MAC_IS_ZERO(c->mac)) + nl_link(0, ifi, c->mac, 0, 0);
if (IN6_IS_ADDR_UNSPECIFIED(&c->gw6) || IN6_IS_ADDR_UNSPECIFIED(&c->addr6) || IN6_IS_ADDR_UNSPECIFIED(&c->addr6_ll) || MAC_IS_ZERO(c->mac)) - c->v6 = 0; + return 0;
- if (!c->v4 && !c->v6) { - err("External interface not usable"); - exit(EXIT_FAILURE); - } + return ifi; }
/** @@ -889,7 +895,7 @@ static void conf_print(const struct ctx *c) c->mac[0], c->mac[1], c->mac[2], c->mac[3], c->mac[4], c->mac[5]);
- if (c->v4) { + if (c->ifi4) { if (!c->no_dhcp) { info("DHCP:"); info(" assign: %s", @@ -914,7 +920,7 @@ static void conf_print(const struct ctx *c) } }
- if (c->v6) { + if (c->ifi6) { char buf6[INET6_ADDRSTRLEN];
if (!c->no_ndp && !c->no_dhcpv6) @@ -1063,6 +1069,7 @@ void conf(struct ctx *c, int argc, char **argv) int name, ret, mask, b, i; uint32_t *dns4 = c->dns4; bool v4_only = false, v6_only = false; + unsigned int ifi = 0;
if (c->mode == MODE_PASTA) c->no_dhcp_dns = c->no_dhcp_dns_search = 1; @@ -1390,12 +1397,12 @@ void conf(struct ctx *c, int argc, char **argv) usage(argv[0]); break; case 'i': - if (c->ifi4 || c->ifi6) { + if (ifi) { err("Redundant interface: %s", optarg); usage(argv[0]); }
- if (!(c->ifi4 = c->ifi6 = if_nametoindex(optarg))) { + if (!(ifi = if_nametoindex(optarg))) { err("Invalid interface name %s: %s", optarg, strerror(errno)); usage(argv[0]); @@ -1503,10 +1510,15 @@ void conf(struct ctx *c, int argc, char **argv) err("Options ipv4-only and ipv6-only are mutually exclusive"); usage(argv[0]); } - c->v4 = !v6_only; - c->v6 = !v4_only; - conf_ip(c); - + if (!v6_only) + c->ifi4 = conf_ip4(c, ifi); + if (!v4_only) + c->ifi6 = conf_ip6(c, ifi); + if (!c->ifi4 && !c->ifi6) { + err("External interface not usable"); + exit(EXIT_FAILURE); + } +
I dropped this tab. -- Stefano
The context structure contains a batch of fields specific to IPv4 and to
IPv6 connectivity. Split those out into a sub-structure.
This allows the conf_ip4() and conf_ip6() functions, which take the
entire context but touch very little of it, to be given more specific
parameters, making it clearer what it affects without stepping through the
code.
Signed-off-by: David Gibson
participants (2)
-
David Gibson
-
Stefano Brivio