[PATCH v2 00/23] Allow configuration of special case NATs
Based on Stefano's recent patch for faster tests.

Allow the user to specify which addresses are translated when used by the
guest, rather than always being the gateway address or nothing. We also
allow this remapping to go to the host's global address (more precisely
the address assigned to the guest) rather than just host loopback.

Along the way to implementing that make many changes to clarify what
various addresses we track mean, fixing a number of small bugs as well.

Paul, amongst other things, I think this will allow podman to (finally)
nicely address #19213, picking an address to remap to the host's external
address with --map-guest-addr, much like it already uses --dns-forward.

Changes in v2:
 * Assorted minor stylistic fixes based on Stefano's review
 * Change name of the new options from --nat-* to --map-*
 * Shorten descriptions of new options in --help (leave the full text to
   the man page)
 * Add fix for the fact that changing MTU causes IPv6 to be temporarily
   deconfigured during perf tests

David Gibson (23):
  treewide: Use "our address" instead of "forwarding address"
  util: Helper for formatting MAC addresses
  treewide: Rename MAC address fields for clarity
  treewide: Use struct assignment instead of memcpy() for IP addresses
  conf: Use array indices rather than pointers for DNS array slots
  conf: More accurately count entries added in get_dns()
  conf: Move DNS array bounds checks into add_dns[46]
  conf: Move adding of a nameserver from resolv.conf into subfunction
  conf: Correct setting of dns_match address in add_dns6()
  conf: Treat --dns addresses as guest visible addresses
  conf: Remove incorrect initialisation of addr_ll_seen
  util: Correct sock_l4() binding for link local addresses
  treewide: Change misleading 'addr_ll' name
  Clarify which addresses in ip[46]_ctx are meaningful where
  Initialise our_tap_ll to ip6.gw when suitable
  fwd: Helpers to clarify what host addresses aren't guest accessible
  fwd: Split notion of "our tap address" from gateway for IPv4
  Don't take "our" MAC address from the host
  conf, fwd: Split notion of gateway/router from guest-visible host address
  test: Reconfigure IPv6 address after changing MTU
  conf: Allow address remapped to host to be configured
  fwd: Distinguish translatable from untranslatable addresses on inbound
  fwd, conf: Allow NAT of the guest's assigned address

 arp.c                 |   4 +-
 conf.c                | 318 +++++++++++++++++++++++++-----------------
 dhcp.c                |  21 +--
 dhcpv6.c              |  21 +--
 flow.c                |  72 +++++-----
 flow.h                |  18 +--
 fwd.c                 | 170 +++++++++++++++++-----
 icmp.c                |   4 +-
 ndp.c                 |   9 +-
 passt.1               |  43 +++++-
 passt.c               |   2 +-
 passt.h               |  53 +++++--
 pasta.c               |  14 +-
 tap.c                 |  12 +-
 tcp.c                 |  33 ++---
 tcp_internal.h        |   2 +-
 test/lib/setup        |  11 +-
 test/passt_in_ns/dhcp |  73 ++++++++++
 test/passt_in_ns/tcp  |  38 +++--
 test/passt_in_ns/udp  |  22 +--
 test/perf/passt_tcp   |  37 ++---
 test/perf/passt_udp   |  31 ++--
 test/perf/pasta_tcp   |  29 ++--
 test/perf/pasta_udp   |  25 ++--
 test/run              |   4 +-
 udp.c                 |  12 +-
 util.c                |  22 ++-
 util.h                |   4 +-
 28 files changed, 712 insertions(+), 392 deletions(-)
 create mode 100644 test/passt_in_ns/dhcp

--
2.46.0
The term "forwarding address" to indicate the local-to-passt address was
well-intentioned, but ends up being kinda confusing. As discussed on a
recent call, let's try "our" instead.
(While we're there, correct an error in flow_initiate_af()'s comments,
where we referred to parameters by the wrong name.)
Signed-off-by: David Gibson
There are a couple of places where we somewhat messily open code formatting
an Ethernet-like MAC address for display. Add an eth_ntop() helper for
this.
Signed-off-by: David Gibson
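For readers without the diff to hand, here is a minimal sketch of what a
MAC formatting helper of this kind might look like. The name mac_to_str()
and its signature are assumptions made for illustration, not the
eth_ntop() added by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAC_STR_LEN	18	/* "xx:xx:xx:xx:xx:xx" plus terminating NUL */

/**
 * mac_to_str() - Hypothetical helper: format a MAC address for display
 * @mac:	Six bytes of Ethernet-like MAC address
 * @dst:	Buffer for the formatted string
 * @size:	Size of @dst, should be at least MAC_STR_LEN
 *
 * Return: @dst on success, NULL if @dst is too small
 */
const char *mac_to_str(const unsigned char *mac, char *dst, size_t size)
{
	int len;

	assert(mac && dst);

	len = snprintf(dst, size, "%02x:%02x:%02x:%02x:%02x:%02x",
		       mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
	if (len < 0 || (size_t)len >= size)
		return NULL;	/* output error or truncated string */

	return dst;
}
```

Returning the destination buffer lets a caller use the result directly as
a printf() style argument, in the manner of inet_ntop().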
c->mac isn't a great name, because it doesn't say whose MAC address it is
and it's not necessarily obvious in all the contexts we use it. Since this
is specifically the address that we (passt/pasta) use on the tap interface,
rename it to "our_tap_mac". Rename the "mac_guest" field to "guest_mac"
to be grammatically consistent.
Signed-off-by: David Gibson
We rely on C11 already, so we can use clearer and more type-checkable
struct assignment instead of memcpy() for copying IP addresses around.
This exposes some "pointer could be const" warnings from cppcheck, so
address those too.
Signed-off-by: David Gibson
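As an illustration of the difference, the sketch below copies an IPv4
address both ways. The structure and function names are hypothetical and
much simpler than passt's real context.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>

/* Assumption of this sketch: IPv4 addresses are stored as 4 byte structs */
static_assert(sizeof(struct in_addr) == 4, "IPv4 addresses are 4 bytes");

/* Hypothetical, simplified context: not passt's real structure layout */
struct ctx_sketch {
	struct in_addr addr;		/* address assigned to the guest */
	struct in_addr addr_seen;	/* address observed in guest traffic */
};

/* Old style: the compiler can't check that sizes and types match */
void copy_with_memcpy(struct ctx_sketch *c)
{
	memcpy(&c->addr_seen, &c->addr, sizeof(c->addr_seen));
}

/* New style: both sides must have the same type, or this won't compile */
void copy_with_assignment(struct ctx_sketch *c)
{
	c->addr_seen = c->addr;
}
```

Both functions have the same effect at run time. The assignment form
simply moves the size and type check to compile time.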
Currently add_dns[46]() take a somewhat awkward double pointer to the
entry in the c->ip[46].dns array to update. It turns out to be easier to
work with indices into that array instead.
This diff does add some lines, but it's comments, and will allow some
future code reductions.
Signed-off-by: David Gibson
get_dns() counts the number of guest DNS servers it adds, and gives an
error if it couldn't add any. However, this count ignores the fact that
add_dns[46]() may in some cases *not* add an entry. Use the array indices
we're already tracking to get an accurate count.
Signed-off-by: David Gibson
Every time we call add_dns[46] we need to first check if there's space in
the c->ip[46].dns array for the new entry. We might as well make that
check in add_dns[46]() itself.
In fact it looks like the calls in get_dns() had an off-by-one error, not
allowing the last entry of the array to be filled. So, that bug is also
fixed by the change.
Signed-off-by: David Gibson
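A minimal sketch of the pattern, assuming a small fixed size array and an
index based interface. MAXNS_SKETCH, struct dns_sketch and
add_dns4_sketch() are hypothetical stand-ins, not the real conf.c code.

```c
#include <assert.h>
#include <netinet/in.h>

#define MAXNS_SKETCH	3	/* hypothetical capacity of the DNS array */

/* Hypothetical, simplified stand-in for the c->ip4.dns array */
struct dns_sketch {
	struct in_addr dns[MAXNS_SKETCH];
};

/**
 * add_dns4_sketch() - Store a nameserver if there is room for it
 * @c:		Configuration to update
 * @addr:	Nameserver address to add
 * @idx:	Index of the first free slot in @c->dns
 *
 * Return: index of the first free slot after the call
 */
unsigned add_dns4_sketch(struct dns_sketch *c, struct in_addr addr,
			 unsigned idx)
{
	assert(c);

	/* The bounds check lives here, once, rather than in each caller.
	 * Comparing against the full capacity lets the last slot be used.
	 */
	if (idx >= MAXNS_SKETCH)
		return idx;

	c->dns[idx] = addr;
	return idx + 1;
}
```

Because the returned index only advances when an entry is really stored,
a caller can also use it as an accurate count of the entries added.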
get_dns() is already quite deeply nested, and future changes I have in
mind will add more complexity. Prepare for this by splitting out the
adding of a single nameserver to the configuration into its own function.
Signed-off-by: David Gibson
add_dns6() (but not add_dns4()) has a bug setting dns_match: it sets it to
the given address, rather than the gateway address. This is doubly wrong:
- We've just established the given address is a host loopback address
the guest can't access
- We've just set ip6.dns[] to tell the guest to use the gateway address,
so it won't use the dns_match address we're setting
Correct this to use the gateway address, like IPv4.
Signed-off-by: David Gibson
Although it's not 100% explicit in the man page, addresses given to
the --dns option are intended to be addresses as seen by the guest.
This differs from addresses taken from the host's /etc/resolv.conf,
which must be translated to guest accessible versions in some cases.
Our implementation is currently inconsistent on this: when using
--dns-forward, you must usually also give --dns with the matching address,
which is meaningful only in the guest's address view. However if you give
--dns with a loopback address, it will be translated like a host view
address.
Move the remapping logic for DNS addresses out of add_dns4() and add_dns6()
into add_dns_resolv() so that it is only applied for host nameserver
addresses, not for nameservers given explicitly with --dns.
Signed-off-by: David Gibson
Despite the names, addr_ll_seen does not relate to addr_ll the same
way addr_seen relates to addr. addr_ll_seen is an observed address
from the guest, whereas addr_ll is *our* link-local address for use on
the tap link when we can't use an external endpoint address. It's
used both for passt provided services (DHCPv6, NDP) and in some cases
for connections from addresses the guest can't access.
Signed-off-by: David Gibson
When binding an IPv6 socket in sock_l4() we need to supply a scope id
if the address is link-local. We check for this by comparing the
given address to c->ip6.addr_ll. This is correct only by accident:
while c->ip6.addr_ll is typically set to the host interface's link
local address, the actual purpose of it is to provide a link local
address for passt's private use on the tap interface.
Instead set the scope id for any link-local address we're binding to.
We're going to need something and this is what makes sense for sockets
on the host. It doesn't make sense for PIF_SPLICE sockets, but those
should always have loopback, not link-local addresses.
Signed-off-by: David Gibson
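A rough sketch of the corrected check, using the standard
IN6_IS_ADDR_LINKLOCAL() macro. The helper name fill_bind_addr6() and its
parameters are assumptions for illustration, this is not the sock_l4()
code itself.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/**
 * fill_bind_addr6() - Hypothetical helper: prepare an address for bind()
 * @sa:		Socket address to fill in
 * @addr:	IPv6 address to bind to
 * @port:	Port to bind to, host byte order
 * @ifi:	Index of the host interface, used as scope if needed
 */
void fill_bind_addr6(struct sockaddr_in6 *sa, const struct in6_addr *addr,
		     in_port_t port, unsigned int ifi)
{
	assert(sa && addr);

	memset(sa, 0, sizeof(*sa));
	sa->sin6_family = AF_INET6;
	sa->sin6_port = htons(port);
	sa->sin6_addr = *addr;

	/* Any link-local address needs a scope, not just one specific,
	 * previously recorded address
	 */
	if (IN6_IS_ADDR_LINKLOCAL(addr))
		sa->sin6_scope_id = ifi;
}
```

Testing the class of the address, rather than comparing it to one stored
value, keeps the check correct whatever that stored value is later used
for.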
c->ip6.addr_ll is not like c->ip6.addr. The latter is an address for the
guest, but the former is an address for our use on the tap link. Rename it
accordingly, to 'our_tap_ll'.
Signed-off-by: David Gibson
Some addresses in ip[46]_ctx are guest visible addresses and may not be
valid on the host, others
are host visible addresses and may not be valid on the guest. Rearrange
and comment the ip[46]_ctx definitions to make it clearer which is which.
Signed-off-by: David Gibson
In every place we use our_tap_ll, we only use it as a fallback if the
IPv6 gateway address is not link-local. We can avoid that conditional at
use time by doing it at initialisation of our_tap_ll instead.
Signed-off-by: David Gibson
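The idea can be sketched as below. The structure is a hypothetical,
cut-down version of the IPv6 context, and passing the fallback link-local
address as a parameter is an assumption of this example.

```c
#include <assert.h>
#include <netinet/in.h>

/* Hypothetical, simplified IPv6 context */
struct ip6_sketch {
	struct in6_addr gw;		/* gateway address */
	struct in6_addr our_tap_ll;	/* our link-local address on tap */
};

/**
 * init_our_tap_ll() - Pick our link-local address once, at start-up
 * @ip6:	Context to update, @ip6->gw must already be set
 * @fallback:	Link-local address to use if the gateway isn't link-local
 */
void init_our_tap_ll(struct ip6_sketch *ip6, const struct in6_addr *fallback)
{
	assert(ip6 && fallback);

	if (IN6_IS_ADDR_LINKLOCAL(&ip6->gw))
		ip6->our_tap_ll = ip6->gw;
	else
		ip6->our_tap_ll = *fallback;
}
```

After this, users of our_tap_ll can read the field directly instead of
repeating the link-local test on the gateway each time.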
We usually avoid NAT, but in a few cases we need to apply address
translations. For inbound connections that happens for addresses which
make sense to the host but are either inaccessible, or mean a different
location from the guest's point of view.
Add some helper functions to determine such addresses, and use them in
fwd_nat_from_host(). In doing so clarify some of the reasons for the
logic. We'll also have further use for these helpers in future.
While we're there fix one unnecessary inconsistency between IPv4 and IPv6.
We always translated the guest's observed address, but for IPv4 we didn't
translate the guest's assigned address, whereas for IPv6 we did. Change
this to translate both in all cases for consistency.
Signed-off-by: David Gibson
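A minimal IPv4-only sketch of such a helper, covering just the cases
named in this series (host loopback, plus the guest's assigned and
observed addresses). The names are hypothetical and the real helpers in
fwd.c may well check more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical, simplified view of the guest's IPv4 addresses */
struct guest4_sketch {
	struct in_addr addr;		/* address assigned to the guest */
	struct in_addr addr_seen;	/* address observed from the guest */
};

/**
 * guest_inaccessible4() - Does @addr need translation before the guest
 *			   can see it as a source address?
 * @g:		Guest address information
 * @addr:	Host-side address to check
 *
 * Return: true if @addr is unreachable or means something else in the guest
 */
bool guest_inaccessible4(const struct guest4_sketch *g, struct in_addr addr)
{
	assert(g);

	/* Host loopback, 127.0.0.0/8: the guest has its own loopback */
	if ((ntohl(addr.s_addr) >> 24) == 127)
		return true;

	/* In the guest these addresses mean the guest itself, not the host */
	if (addr.s_addr == g->addr.s_addr ||
	    addr.s_addr == g->addr_seen.s_addr)
		return true;

	return false;
}
```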
ip4.gw conflates 3 conceptually different things, which (for now) have the
same value:
1. The router/gateway address as seen by the guest
2. An address to NAT to the host when --no-map-gw isn't specified
3. An address to use as source when nothing else makes sense
Case 3 occurs in two situations:
a) for our DHCP responses - since they come from passt internally there's
no naturally meaningful address for them to come from
b) for forwarded connections coming from an address that isn't guest
accessible (localhost or the guest's own address).
(b) occurs even with --no-map-gw, and the expected behaviour of forwarding
local connections requires it.
For IPv6 role (3) is now taken by ip6.our_tap_ll (which usually has the
same value as ip6.gw). For future flexibility we may want to make this
"address of last resort" different from the gateway address, so split them
logically for IPv4 as well.
Specifically, add a new ip4.our_tap_addr field for the address with this
role, and initialise it to ip4.gw for now. Unlike IPv6 where we can always
get a link-local address, we might not be able to get a (non 0.0.0.0)
address here (e.g. if the host is disconnected or only has a point to point
link with no gateway address). In that case we have to disable forwarding
of inbound connections with guest-inaccessible source addresses.
Signed-off-by: David Gibson
When sending frames to the guest over the tap link, we need a source MAC
address. Currently we take that from the MAC address of the main interface
on the host, but that doesn't actually make much sense:
* We can't preserve the real MAC address of packets from anywhere
external so there's no transparency case here
* In fact, it's confusingly different from how we handle IP addresses:
whereas we give the guest the same IP as the host, we're making the
host's MAC the one MAC that the guest *can't* use for itself.
* We already need a fallback case if the host doesn't have an
Ethernet-like MAC (e.g. if it's connected via a point to point interface, such
as a wireguard VPN).
Change to just use an arbitrary fixed MAC address - I've picked
9a:55:9a:55:9a:55. It's simpler and has the small advantage of making
the fact that passt/pasta is in use typically obvious from guest side
packet dumps. This can still, of course, be overridden with the -M option.
Signed-off-by: David Gibson
The @gw fields in the ip4_ctx and ip6_ctx give the (host's) default
gateway. We use this for two quite distinct things: advertising the
gateway that the guest should use (via DHCP, NDP and/or --config-net)
and for a limited form of NAT. So that the guest can access services
on the host, we map the gateway address within the guest to the
loopback address on the host.
The gateway address isn't necessarily the best choice for this purpose,
certainly not in all circumstances. So, start off
by splitting the notion of these into two different values: @guest_gw
which is the gateway address the guest should use and @nat_host_loopback,
which is the guest visible address to remap to the host's loopback.
Usually nat_host_loopback will have the same value as guest_gw. However
when --no-map-gw is specified we leave them unspecified instead. This
means when we use nat_host_loopback, we don't need to separately check
c->no_map_gw to see if it's relevant.
Signed-off-by: David Gibson
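A small sketch of how the split might look for IPv4. The field names
follow the description above, but the structure, the function names and
the use of 0.0.0.0 to mean "unspecified" are assumptions of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical, simplified IPv4 context */
struct ip4_sketch {
	struct in_addr guest_gw;		/* gateway advertised to guest */
	struct in_addr nat_host_loopback;	/* mapped to host loopback */
};

/**
 * init_gw_roles() - Set up both roles from the host's default gateway
 * @ip4:	Context to initialise
 * @host_gw:	Default gateway of the host
 * @no_map_gw:	True if --no-map-gw was given
 */
void init_gw_roles(struct ip4_sketch *ip4, struct in_addr host_gw,
		   bool no_map_gw)
{
	assert(ip4);

	ip4->guest_gw = host_gw;

	if (no_map_gw)
		ip4->nat_host_loopback.s_addr = htonl(INADDR_ANY);
	else
		ip4->nat_host_loopback = host_gw;
}

/* Should a packet from the guest to @dst be redirected to host loopback? */
bool maps_to_host_loopback(const struct ip4_sketch *ip4, struct in_addr dst)
{
	return ip4->nat_host_loopback.s_addr != htonl(INADDR_ANY) &&
	       dst.s_addr == ip4->nat_host_loopback.s_addr;
}
```

Note that maps_to_host_loopback() never looks at the --no-map-gw setting
itself: an unspecified nat_host_loopback already encodes it.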
In the TCP throughput tests, we adjust the guest's MTU in order to test
various packet sizes. Some of those are below 1280 which causes IPv6 to
be deconfigured on the guest interface. When we increase it above 1280
again, IPv6 is re-enabled and we get an address in the right prefix with
NDP, but we don't get exactly the expected address back - that's only
communicated with --config-net or DHCPv6.
With changes to how we handle NAT this can cause some of the IPv6 tests to
fail, because they don't use the address that passt/pasta expects, and the
guest doesn't initiate any traffic which allows us to learn what the new
address is.
Work around this by re-invoking dhclient -6 between adjusting the MTU and
running IPv6 test cases.
Signed-off-by: David Gibson
Because the host and guest share the same IP address with passt/pasta, it's
not possible for the guest to directly address the host. Therefore we
allow packets from the guest going to a special "NAT to host" address to be
redirected to the host, appearing there as though they have both source and
destination address of loopback.
Currently that special address is always the address of the default
gateway (or none). That can be a problem if we want that gateway to be
addressable by the guest. Therefore, allow the special "NAT to host"
address to be overridden on the command line with a new --map-host-loopback
option.
In order to exercise and test it, update the passt_in_ns and perf
tests to use this option and give different mapping addresses for the
two layers of the environment.
Signed-off-by: David Gibson
fwd_nat_from_host() needs to adjust the source address for new flows coming
from an address which is not accessible to the guest. Currently we always
use our_tap_addr or our_tap_ll. However in cases where the address is
accessible to the guest via translation (i.e. via --map-host-loopback) then
it makes more sense to use that translation, rather than the fallback
mapping of our_tap_*.
Signed-off-by: David Gibson
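The order of preference described here can be sketched as follows, for
the host loopback case only. All names are hypothetical, 0.0.0.0 is
assumed to mean "not available", and the real fwd_nat_from_host() handles
more cases than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical, simplified forwarding configuration */
struct fwd4_sketch {
	struct in_addr map_host_loopback;	/* 0.0.0.0 if no mapping */
	struct in_addr our_tap_addr;		/* 0.0.0.0 if unavailable */
};

/**
 * nat_src_from_host4() - Choose the guest-side source for an inbound flow
 * @c:		Forwarding configuration
 * @host_src:	Source address as seen on the host
 * @guest_src:	Set to the source address to present to the guest
 *
 * Return: true on success, false if the flow can't be forwarded
 */
bool nat_src_from_host4(const struct fwd4_sketch *c, struct in_addr host_src,
			struct in_addr *guest_src)
{
	assert(c && guest_src);

	if ((ntohl(host_src.s_addr) >> 24) != 127) {
		*guest_src = host_src;		/* no translation needed */
		return true;
	}

	/* Prefer the address the guest itself would use to reach loopback */
	if (c->map_host_loopback.s_addr != htonl(INADDR_ANY)) {
		*guest_src = c->map_host_loopback;
		return true;
	}

	/* Otherwise fall back to our address of last resort, if we have one */
	if (c->our_tap_addr.s_addr != htonl(INADDR_ANY)) {
		*guest_src = c->our_tap_addr;
		return true;
	}

	return false;
}
```

With the translated address as source, replies from the guest go to an
address that maps straight back to the original host endpoint.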
The guest is usually assigned one of the host's IP addresses. That means
it can't access the host itself via its usual address. The
--map-host-loopback option (enabled by default with the gateway address)
allows the guest to contact the host. However, connections forwarded this
way appear on the host to have originated from the loopback interface,
which isn't always desirable.
Add a new --map-guest-addr option, which acts similarly but forwarded
connections will go to the host's external address, instead of loopback.
If '-a' is used, so the guest's address is not the same as the host's, this
will instead forward to whatever host-visible site is shadowed by the
guest's assigned address.
Signed-off-by: David Gibson
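Putting the two options side by side, destination translation for
packets from the guest might be sketched like this. The structure and
function are hypothetical, IPv4 only, and 0.0.0.0 is assumed to mean that
an option is unset.

```c
#include <assert.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical, simplified mapping configuration */
struct map4_sketch {
	struct in_addr addr;			/* guest's assigned address */
	struct in_addr map_host_loopback;	/* --map-host-loopback */
	struct in_addr map_guest_addr;		/* --map-guest-addr */
};

/**
 * nat_dst_from_guest4() - Translate a destination used by the guest
 * @c:		Mapping configuration
 * @dst:	Destination address as used by the guest
 *
 * Return: destination address to use on the host side
 */
struct in_addr nat_dst_from_guest4(const struct map4_sketch *c,
				   struct in_addr dst)
{
	struct in_addr lo = { .s_addr = htonl(INADDR_LOOPBACK) };

	assert(c);

	if (c->map_host_loopback.s_addr != htonl(INADDR_ANY) &&
	    dst.s_addr == c->map_host_loopback.s_addr)
		return lo;		/* arrives on host loopback */

	if (c->map_guest_addr.s_addr != htonl(INADDR_ANY) &&
	    dst.s_addr == c->map_guest_addr.s_addr)
		return c->addr;		/* host-side owner of that address */

	return dst;			/* no translation */
}
```

When the guest shares the host's address, c->addr is the host's external
address. With '-a' it is whatever host-visible site the guest's assigned
address shadows, as described above.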
On Wed, 21 Aug 2024 14:19:56 +1000
David Gibson wrote:

> Based on Stefano's recent patch for faster tests.
>
> Allow the user to specify which addresses are translated when used by the
> guest, rather than always being the gateway address or nothing. We also
> allow this remapping to go to the host's global address (more precisely
> the address assigned to the guest) rather than just host loopback.
>
> Along the way to implementing that make many changes to clarify what
> various addresses we track mean, fixing a number of small bugs as well.
>
> Paul, amongst other things, I think this will allow podman to (finally)
> nicely address #19213, picking an address to remap to the host's external
> address with --map-guest-addr, much like it already uses --dns-forward.
>
> Changes in v2:
>  * Assorted minor stylistic fixes based on Stefano's review
>  * Change name of the new options from --nat-* to --map-*
>  * Shorten descriptions of new options in --help (leave the full text to
>    the man page)
>  * Add fix for the fact that changing MTU causes IPv6 to be temporarily
>    deconfigured during perf tests
Applied! -- Stefano
participants (2)
- David Gibson
- Stefano Brivio