These patches work around or fix some issues found while testing the
Podman integration for pasta in Google Compute Engine environments.

v3: Fix up botched rebase, especially in 2/3, further reword 2/3, fix up
    1/3 to match the recent patch from David that stores IPv4 netmasks
    as prefix lengths instead of masks

v2: Rephrase commit messages of 1/3 and 2/3 adding examples with
    addresses. In 2/3, set host addresses (dns4 and dns6) and increase
    the pointer regardless of what we're doing with dns4_send and
    dns6_send.

Stefano Brivio (3):
  conf: Adjust netmask on mismatch between IPv4 address/netmask and
    gateway
  conf: Split the notions of read DNS addresses and offered ones
  udp: Check for answers to forwarded DNS queries before handling local
    redirects

 conf.c   | 74 ++++++++++++++++++++++++++++++++++++++++++++------------
 dhcp.c   |  4 +--
 dhcpv6.c |  5 ++--
 ndp.c    |  6 ++---
 passt.h  |  8 ++++--
 udp.c    | 22 ++++++++---------
 6 files changed, 84 insertions(+), 35 deletions(-)

-- 
2.35.1

Seen in a Google Compute Engine environment with a machine configured
via cloud-init-dhcp, while testing Podman integration for pasta: the
assigned address has a /32 netmask, and there's a default route,
which can be added on the host because there's another route, also
/32, pointing to the default gateway. For example, on the host:

  ip -4 address add 10.156.0.2/32 dev eth0
  ip -4 route add 10.156.0.1/32 dev eth0
  ip -4 route add default via 10.156.0.1

This is not a valid configuration as far as I can tell: if the
address is configured as /32, it shouldn't be used to reach a gateway
outside its derived netmask. However, Linux allows that, and
everything works.

The problem comes when pasta --config-net sources address and default
route from the host, and it can't configure the route in the target
namespace because the gateway is invalid. That is, we would skip
configuring the first route in the example, which results in the
equivalent of doing:

  ip -4 address add 10.156.0.2/32 dev eth0
  ip -4 route add default via 10.156.0.1

where, at this point, 10.156.0.1 is unreachable, and hence invalid
as a gateway.

Sourcing more routes than just the default is doable, but probably
undesirable: pasta users want to provide connectivity to a container,
not reflect exactly whatever trickery is configured on the host.

Add a consistency check and an adjustment: if the configured default
gateway is not reachable, shrink the given netmask until we can reach
it.

Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
 conf.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/conf.c b/conf.c
index f5099a7..b51effc 100644
--- a/conf.c
+++ b/conf.c
@@ -558,6 +558,9 @@ static int conf_ip4_prefix(const char *arg)
 static unsigned int conf_ip4(unsigned int ifi,
 			     struct ip4_ctx *ip4, unsigned char *mac)
 {
+	in_addr_t addr, gw;
+	int shift;
+
 	if (!ifi)
 		ifi = nl_get_ext_if(AF_INET);
 
@@ -572,8 +575,10 @@ static unsigned int conf_ip4(unsigned int ifi,
 	if (IN4_IS_ADDR_UNSPECIFIED(&ip4->addr))
 		nl_addr(0, ifi, AF_INET, &ip4->addr, &ip4->prefix_len, NULL);
 
+	addr = ntohl(ip4->addr.s_addr);
+	gw = ntohl(ip4->gw.s_addr);
+
 	if (!ip4->prefix_len) {
-		in_addr_t addr = ntohl(ip4->addr.s_addr);
 		if (IN_CLASSA(addr))
 			ip4->prefix_len = (32 - IN_CLASSA_NSHIFT);
 		else if (IN_CLASSB(addr))
@@ -584,6 +589,24 @@ static unsigned int conf_ip4(unsigned int ifi,
 			ip4->prefix_len = 32;
 	}
 
+	/* We might get an address with a netmask that makes the default
+	 * gateway unreachable, and in that case we would fail to configure
+	 * the default route, with --config-net, or presumably a DHCP client
+	 * in the guest or container would face the same issue.
+	 *
+	 * The host might have another route, to the default gateway itself,
+	 * fixing the situation, but we only read default routes.
+	 *
+	 * Fix up the mask to allow reaching the default gateway from our
+	 * configured address, if needed, and only if we find a non-zero
+	 * mask that makes the gateway reachable.
+	 */
+	shift = 32 - ip4->prefix_len;
+	while (shift < 32 && addr >> shift != gw >> shift)
+		shift++;
+	if (shift < 32)
+		ip4->prefix_len = 32 - shift;
+
 	memcpy(&ip4->addr_seen, &ip4->addr, sizeof(ip4->addr_seen));
 
 	if (MAC_IS_ZERO(mac))
-- 
2.35.1

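For anybody who wants to try the prefix adjustment from 1/3 outside of conf_ip4(), here is a minimal standalone sketch of the same loop. The helper name adjust_prefix_len() is hypothetical and not part of passt; it takes addresses in host byte order and, as an assumption of this sketch, returns the prefix length unchanged if no non-zero prefix makes the gateway reachable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of passt: shorten an IPv4 prefix until
 * the gateway shares the network part with the address. Both addresses
 * are expected in host byte order.
 */
static int adjust_prefix_len(uint32_t addr, uint32_t gw, int prefix_len)
{
	int shift;

	assert(prefix_len >= 0 && prefix_len <= 32);

	shift = 32 - prefix_len;

	/* The shift < 32 guard keeps the shift on 32-bit operands defined */
	while (shift < 32 && (addr >> shift) != (gw >> shift))
		shift++;

	/* Only accept a non-zero prefix, otherwise leave it untouched */
	return shift < 32 ? 32 - shift : prefix_len;
}
```

With the addresses from the commit message, 10.156.0.2 (0x0a9c0002) and 10.156.0.1 (0x0a9c0001) first agree once the two lowest bits are dropped, so a /32 should become a /30.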
With --dns-forward, if the host has a loopback address configured as
DNS server, we should actually use it to forward queries, but, if
--no-map-gw is passed, we shouldn't offer the same address via DHCP,
NDP and DHCPv6, because it's not going to be reachable.

Problematic configuration:

* systemd-resolved configuring the usual 127.0.0.53 on the host: we
  read that from /etc/resolv.conf

* --dns-forward specified with an unrelated address, for example
  198.51.100.1

We still want to forward queries to 127.0.0.53, if we receive one
directed to 198.51.100.1, so we can't drop 127.0.0.53 from our list:
we want to use it for forwarding. At the same time, we shouldn't
offer 127.0.0.53 to the guest or container either.

With this change, I'm only covering the case of automatically
configured DNS servers from /etc/resolv.conf. We could extend this to
addresses configured with command-line options, but I don't really
see a likely use case at this point.

Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
 conf.c   | 49 +++++++++++++++++++++++++++++++++++--------------
 dhcp.c   |  4 ++--
 dhcpv6.c |  5 +++--
 ndp.c    |  6 +++---
 passt.h  |  8 ++++++--
 5 files changed, 49 insertions(+), 23 deletions(-)

diff --git a/conf.c b/conf.c
index b51effc..bfecdff 100644
--- a/conf.c
+++ b/conf.c
@@ -355,10 +355,12 @@ overlap:
  */
 static void get_dns(struct ctx *c)
 {
+	struct in6_addr *dns6_send = &c->ip6.dns_send[0];
+	struct in_addr *dns4_send = &c->ip4.dns_send[0];
 	int dns4_set, dns6_set, dnss_set, dns_set, fd;
 	struct in6_addr *dns6 = &c->ip6.dns[0];
-	struct fqdn *s = c->dns_search;
 	struct in_addr *dns4 = &c->ip4.dns[0];
+	struct fqdn *s = c->dns_search;
 	struct lineread resolvconf;
 	int line_len;
 	char *line, *p, *end;
@@ -388,31 +390,47 @@ static void get_dns(struct ctx *c)
 			if (!dns4_set &&
 			    dns4 - &c->ip4.dns[0] < ARRAY_SIZE(c->ip4.dns) - 1 &&
 			    inet_pton(AF_INET, p + 1, dns4)) {
-				/* We can only access local addresses via the gw redirect */
+				/* Guest or container can only access local
+				 * addresses via local redirect
+				 */
 				if (IN4_IS_ADDR_LOOPBACK(dns4)) {
-					if (c->no_map_gw) {
-						dns4->s_addr = htonl(INADDR_ANY);
-						continue;
+					if (!c->no_map_gw) {
+						*dns4_send = c->ip4.gw;
+						dns4_send++;
 					}
-					*dns4 = c->ip4.gw;
+				} else {
+					*dns4_send = *dns4;
+					dns4_send++;
 				}
+
 				dns4++;
+
 				dns4->s_addr = htonl(INADDR_ANY);
+				dns4_send->s_addr = htonl(INADDR_ANY);
 			}
 
 			if (!dns6_set &&
 			    dns6 - &c->ip6.dns[0] < ARRAY_SIZE(c->ip6.dns) - 1 &&
 			    inet_pton(AF_INET6, p + 1, dns6)) {
-				/* We can only access local addresses via the gw redirect */
+				/* Guest or container can only access local
+				 * addresses via local redirect
+				 */
 				if (IN6_IS_ADDR_LOOPBACK(dns6)) {
-					if (c->no_map_gw) {
-						memset(dns6, 0, sizeof(*dns6));
-						continue;
+					if (!c->no_map_gw) {
+						memcpy(dns6_send, &c->ip6.gw,
+						       sizeof(*dns6_send));
+						dns6_send++;
 					}
-					memcpy(dns6, &c->ip6.gw, sizeof(*dns6));
+				} else {
+					memcpy(dns6_send, dns6,
+					       sizeof(*dns6_send));
+					dns6_send++;
 				}
+
 				dns6++;
+
 				memset(dns6, 0, sizeof(*dns6));
+				memset(dns6_send, 0, sizeof(*dns6_send));
 			}
 		} else if (!dnss_set && strstr(line, "search ") == line &&
 			   s == c->dns_search) {
@@ -876,10 +894,12 @@ static void conf_print(const struct ctx *c)
 			     inet_ntop(AF_INET, &c->ip4.gw, buf4, sizeof(buf4)));
 		}
 
-		for (i = 0; !IN4_IS_ADDR_UNSPECIFIED(&c->ip4.dns[i]); i++) {
+		for (i = 0; !IN4_IS_ADDR_UNSPECIFIED(&c->ip4.dns_send[i]);
+		     i++) {
 			if (!i)
 				info("DNS:");
-			inet_ntop(AF_INET, &c->ip4.dns[i], buf4, sizeof(buf4));
+			inet_ntop(AF_INET, &c->ip4.dns_send[i], buf4,
+				  sizeof(buf4));
 			info(" %s", buf4);
 		}
 
@@ -910,7 +930,8 @@ static void conf_print(const struct ctx *c)
 		     inet_ntop(AF_INET6, &c->ip6.addr_ll, buf6, sizeof(buf6)));
 
 dns6:
-	for (i = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns[i]); i++) {
+	for (i = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_send[i]);
+	     i++) {
 		if (!i)
 			info("DNS:");
 		inet_ntop(AF_INET6, &c->ip6.dns[i], buf6, sizeof(buf6));
diff --git a/dhcp.c b/dhcp.c
index 0c6f712..12da47a 100644
--- a/dhcp.c
+++ b/dhcp.c
@@ -359,9 +359,9 @@ int dhcp(const struct ctx *c, const struct pool *p)
 	}
 
 	for (i = 0, opts[6].slen = 0;
-	     !c->no_dhcp_dns && !IN4_IS_ADDR_UNSPECIFIED(&c->ip4.dns[i]);
+	     !c->no_dhcp_dns && !IN4_IS_ADDR_UNSPECIFIED(&c->ip4.dns_send[i]);
 	     i++) {
-		((struct in_addr *)opts[6].s)[i] = c->ip4.dns[i];
+		((struct in_addr *)opts[6].s)[i] = c->ip4.dns_send[i];
 		opts[6].slen += sizeof(uint32_t);
 	}
 
diff --git a/dhcpv6.c b/dhcpv6.c
index e763aed..67262e6 100644
--- a/dhcpv6.c
+++ b/dhcpv6.c
@@ -379,7 +379,7 @@ static size_t dhcpv6_dns_fill(const struct ctx *c, char *buf, int offset)
 	if (c->no_dhcp_dns)
 		goto search;
 
-	for (i = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns[i]); i++) {
+	for (i = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_send[i]); i++) {
 		if (!i) {
 			srv = (struct opt_dns_servers *)(buf + offset);
 			offset += sizeof(struct opt_hdr);
@@ -387,7 +387,8 @@ static size_t dhcpv6_dns_fill(const struct ctx *c, char *buf, int offset)
 			srv->hdr.l = 0;
 		}
 
-		memcpy(&srv->addr[i], &c->ip6.dns[i], sizeof(srv->addr[i]));
+		memcpy(&srv->addr[i], &c->ip6.dns_send[i],
+		       sizeof(srv->addr[i]));
 		srv->hdr.l += sizeof(srv->addr[i]);
 		offset += sizeof(srv->addr[i]);
 	}
diff --git a/ndp.c b/ndp.c
index 80e1f19..6d79477 100644
--- a/ndp.c
+++ b/ndp.c
@@ -121,7 +121,7 @@ int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr)
 		if (c->no_dhcp_dns)
 			goto dns_done;
 
-		for (n = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns[n]); n++);
+		for (n = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_send[n]); n++);
 		if (n) {
 			*p++ = 25;				/* RDNSS */
 			*p++ = 1 + 2 * n;			/* length */
@@ -130,8 +130,8 @@ int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr)
 			p += 4;
 
 			for (i = 0; i < n; i++) {
-				memcpy(p, &c->ip6.dns[i], 16);	/* address */
-				p += 16;
+				memcpy(p, &c->ip6.dns_send[i], 16);
+				p += 16;			/* address */
 			}
 
 			for (n = 0; *c->dns_search[n].n; n++)
diff --git a/passt.h b/passt.h
index 1a8d74b..e93eea8 100644
--- a/passt.h
+++ b/passt.h
@@ -101,7 +101,8 @@ enum passt_modes {
  * @addr_seen:	Latest IPv4 address seen as source from tap
  * @prefixlen:	IPv4 prefix length (netmask)
  * @gw:		Default IPv4 gateway, network order
- * @dns:	IPv4 DNS addresses, zero-terminated, network order
+ * @dns:	Host IPv4 DNS addresses, zero-terminated, network order
+ * @dns_send:	Offered IPv4 DNS, zero-terminated, network order
  * @dns_fwd:	Address forwarded (UDP) to first IPv4 DNS, network order
  */
 struct ip4_ctx {
@@ -110,6 +111,7 @@ struct ip4_ctx {
 	int prefix_len;
 	struct in_addr gw;
 	struct in_addr dns[MAXNS + 1];
+	struct in_addr dns_send[MAXNS + 1];
 	struct in_addr dns_fwd;
 };
 
@@ -120,7 +122,8 @@ struct ip4_ctx {
  * @addr_seen:	Latest IPv6 global/site address seen as source from tap
  * @addr_ll_seen: Latest IPv6 link-local address seen as source from tap
  * @gw:		Default IPv6 gateway
- * @dns:	IPv6 DNS addresses, zero-terminated
+ * @dns:	Host IPv6 DNS addresses, zero-terminated
+ * @dns_send:	Offered IPv6 DNS addresses, zero-terminated
  * @dns_fwd:	Address forwarded (UDP) to first IPv6 DNS, network order
  */
 struct ip6_ctx {
@@ -130,6 +133,7 @@ struct ip6_ctx {
 	struct in6_addr addr_ll_seen;
 	struct in6_addr gw;
 	struct in6_addr dns[MAXNS + 1];
+	struct in6_addr dns_send[MAXNS + 1];
 	struct in6_addr dns_fwd;
 };
 
-- 
2.35.1

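To illustrate the split between read and offered resolvers from 2/3 without the /etc/resolv.conf parsing around it, here is a rough IPv4-only sketch. Everything in it is an assumption made for illustration: struct dns_lists, split_dns(), is_loopback4() and SKETCH_MAXNS are hypothetical names, and the list of nameservers is passed in as strings instead of being read from a file.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAXNS	3	/* assumption for this sketch only */

/* Hypothetical container mirroring the dns/dns_send pair from the patch */
struct dns_lists {
	struct in_addr dns[SKETCH_MAXNS + 1];		/* read from host */
	struct in_addr dns_send[SKETCH_MAXNS + 1];	/* offered to guest */
};

static bool is_loopback4(const struct in_addr *a)
{
	return (ntohl(a->s_addr) >> 24) == 127;
}

/* Fill both zero-terminated lists, return the number of offered servers */
static size_t split_dns(struct dns_lists *l, const char *const *servers,
			size_t count, struct in_addr gw, bool no_map_gw)
{
	size_t n = 0, n_send = 0, i;

	assert(l && servers);
	memset(l, 0, sizeof(*l));	/* zeroed entries act as terminators */

	for (i = 0; i < count && n < SKETCH_MAXNS; i++) {
		struct in_addr a;

		if (inet_pton(AF_INET, servers[i], &a) != 1)
			continue;

		/* Always keep the host address: it's used for forwarding */
		l->dns[n++] = a;

		if (!is_loopback4(&a))
			l->dns_send[n_send++] = a;
		else if (!no_map_gw)
			l->dns_send[n_send++] = gw;	/* reachable via gw */
		/* else: loopback with --no-map-gw, nothing to offer */
	}

	return n_send;
}
```

With "127.0.0.53" as the only host resolver and no_map_gw set, dns[] still holds 127.0.0.53 as a forwarding target, while dns_send[] stays empty, which is the behaviour the commit message describes.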
Now that we allow loopback DNS addresses to be used as targets for
forwarding, we need to check if DNS answers come from those targets,
before deciding whether to remap traffic for local redirects.

Otherwise, the source address won't match the one configured as
forwarder, which means that the guest or the container will refuse
those responses.

Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
 udp.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/udp.c b/udp.c
index fca418d..42a17a7 100644
--- a/udp.c
+++ b/udp.c
@@ -678,9 +678,13 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n,
 
 	src_port = ntohs(b->s_in.sin_port);
 
-	if (IN4_IS_ADDR_LOOPBACK(&b->s_in.sin_addr) ||
-	    IN4_IS_ADDR_UNSPECIFIED(&b->s_in.sin_addr)||
-	    IN4_ARE_ADDR_EQUAL(&b->s_in.sin_addr, &c->ip4.addr_seen)) {
+	if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.dns_fwd) &&
+	    IN4_ARE_ADDR_EQUAL(&b->s_in.sin_addr, &c->ip4.dns[0]) &&
+	    src_port == 53) {
+		b->iph.saddr = c->ip4.dns_fwd.s_addr;
+	} else if (IN4_IS_ADDR_LOOPBACK(&b->s_in.sin_addr) ||
+		   IN4_IS_ADDR_UNSPECIFIED(&b->s_in.sin_addr)||
+		   IN4_ARE_ADDR_EQUAL(&b->s_in.sin_addr, &c->ip4.addr_seen)) {
 		b->iph.saddr = c->ip4.gw.s_addr;
 		udp_tap_map[V4][src_port].ts = now->tv_sec;
 		udp_tap_map[V4][src_port].flags |= PORT_LOCAL;
@@ -691,10 +695,6 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n,
 			udp_tap_map[V4][src_port].flags |= PORT_LOOPBACK;
 
 		bitmap_set(udp_act[V4][UDP_ACT_TAP], src_port);
-	} else if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.dns_fwd) &&
-		   IN4_ARE_ADDR_EQUAL(&b->s_in.sin_addr, &c->ip4.dns[0]) &&
-		   src_port == 53) {
-		b->iph.saddr = c->ip4.dns_fwd.s_addr;
 	} else {
 		b->iph.saddr = b->s_in.sin_addr.s_addr;
 	}
@@ -770,6 +770,10 @@ static void udp_sock_fill_data_v6(const struct ctx *c, int n,
 	if (IN6_IS_ADDR_LINKLOCAL(src)) {
 		b->ip6h.daddr = c->ip6.addr_ll_seen;
 		b->ip6h.saddr = b->s_in6.sin6_addr;
+	} else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_fwd) &&
+		   IN6_ARE_ADDR_EQUAL(src, &c->ip6.dns[0]) && src_port == 53) {
+		b->ip6h.daddr = c->ip6.addr_seen;
+		b->ip6h.saddr = c->ip6.dns_fwd;
 	} else if (IN6_IS_ADDR_LOOPBACK(src) ||
 		   IN6_ARE_ADDR_EQUAL(src, &c->ip6.addr_seen) ||
 		   IN6_ARE_ADDR_EQUAL(src, &c->ip6.addr)) {
@@ -794,10 +798,6 @@ static void udp_sock_fill_data_v6(const struct ctx *c, int n,
 			udp_tap_map[V6][src_port].flags &= ~PORT_GUA;
 
 		bitmap_set(udp_act[V6][UDP_ACT_TAP], src_port);
-	} else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_fwd) &&
-		   IN6_ARE_ADDR_EQUAL(src, &c->ip6.dns[0]) && src_port == 53) {
-		b->ip6h.daddr = c->ip6.addr_seen;
-		b->ip6h.saddr = c->ip6.dns_fwd;
 	} else {
 		b->ip6h.daddr = c->ip6.addr_seen;
 		b->ip6h.saddr = b->s_in6.sin6_addr;
-- 
2.35.1
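The reason the order of the checks in 3/3 matters can be shown with a small IPv4-only sketch of the source address selection. This is a simplification under stated assumptions: struct fwd_conf and reply_saddr() are hypothetical names, addresses are in host byte order, and the port tracking done by udp_sock_fill_data_v4() (udp_tap_map, bitmaps) is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced view of the configuration, host byte order */
struct fwd_conf {
	uint32_t dns_host;	/* first resolver read from the host */
	uint32_t dns_fwd;	/* address given with --dns-forward, 0 if unset */
	uint32_t gw;		/* default gateway seen by guest */
	uint32_t addr_seen;	/* latest guest address seen from tap */
};

/* Pick the source address the guest will see for a packet from the host */
static uint32_t reply_saddr(const struct fwd_conf *c, uint32_t src,
			    uint16_t src_port)
{
	assert(c);

	/* Answers to forwarded DNS queries first: the resolver might be
	 * a loopback address, which the next check would also match
	 */
	if (c->dns_fwd && src == c->dns_host && src_port == 53)
		return c->dns_fwd;

	/* Local redirect: loopback, unspecified or guest's own address */
	if ((src >> 24) == 127 || !src || src == c->addr_seen)
		return c->gw;

	return src;
}
```

With dns_host set to 127.0.0.53 and dns_fwd to 198.51.100.1, an answer from 127.0.0.53 port 53 should come back as 198.51.100.1. If the loopback check came first, the same answer would be rewritten to the gateway address, which is the mismatch the commit message refers to.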