These patches work around or fix some issues found while testing the Podman integration for pasta in Google Compute Engine environments. Stefano Brivio (3): conf: Consistency check between configured IPv4 netmask and gateway conf: Split the notions of read DNS addresses and offered ones udp: Check for answers to forwarded DNS queries before handling local redirects conf.c | 54 ++++++++++++++++++++++++++++++++++++++---------------- dhcp.c | 5 +++-- dhcpv6.c | 5 +++-- ndp.c | 6 +++--- passt.h | 8 ++++++-- udp.c | 17 ++++++++--------- 6 files changed, 61 insertions(+), 34 deletions(-) -- 2.35.1
Seen in a Google Compute Engine environment with a machine configured via cloud-init-dhcp, while testing Podman integration for pasta: the assigned address has a /32 netmask, and there's a default route, which can be added on the host because there's another route, also /32, pointing to the default gateway. This is not a valid configuration as far as I can tell: if the address is configured as /32, it shouldn't be used to reach a gateway outside its derived netmask. However, Linux allows that, and everything works. The problem comes when pasta --config-net sources address and default route from the host, and it can't configure the route in the target namespace because the gateway is invalid. Sourcing more routes than just the default is doable, but probably undesirable: pasta users want to provide connectivity to a container, not reflect exactly whatever trickery is configured on the host. Add a consistency check: if the configured default gateway is not reachable, shrink the given netmask until we can reach it. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- conf.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/conf.c b/conf.c index 90214f5..5b88547 100644 --- a/conf.c +++ b/conf.c @@ -562,6 +562,10 @@ static unsigned int conf_ip4(unsigned int ifi, ip4->mask = 0xffffffff; } + /* Mask consistency check: ensure we can reach the default gateway */ + while ((ip4->addr & ip4->mask) != (ip4->gw & ip4->mask)) + ip4->mask = htonl(ntohl(ip4->mask) << 1); + memcpy(&ip4->addr_seen, &ip4->addr, sizeof(ip4->addr_seen)); if (MAC_IS_ZERO(mac)) -- 2.35.1
On Thu, Nov 03, 2022 at 12:04:41AM +0100, Stefano Brivio wrote:Seen in a Google Compute Engine environment with a machine configured via cloud-init-dhcp, while testing Podman integration for pasta: the assigned address has a /32 netmask, and there's a default route, which can be added on the host because there's another route, also /32, pointing to the default gateway.I'm afraid I'm having trouble getting a good picture of the situation from this description. I think an actual example with addresses would make it much clearer.This is not a valid configuration as far as I can tell: if the address is configured as /32, it shouldn't be used to reach a gateway outside its derived netmask. However, Linux allows that, and everything works. The problem comes when pasta --config-net sources address and default route from the host, and it can't configure the route in the target namespace because the gateway is invalid. Sourcing more routes than just the default is doable, but probably undesirable: pasta users want to provide connectivity to a container, not reflect exactly whatever trickery is configured on the host. Add a consistency check: if the configured default gateway is not reachable, shrink the given netmask until we can reach it.Hmm... this isn't merely a check, it's changing an otherwise configured value.Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- conf.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/conf.c b/conf.c index 90214f5..5b88547 100644 --- a/conf.c +++ b/conf.c @@ -562,6 +562,10 @@ static unsigned int conf_ip4(unsigned int ifi, ip4->mask = 0xffffffff; } + /* Mask consistency check: ensure we can reach the default gateway */Since this is to handle a very weird situation, we absolutely need a more detailed comment here.+ while ((ip4->addr & ip4->mask) != (ip4->gw & ip4->mask)) + ip4->mask = htonl(ntohl(ip4->mask) << 1); + memcpy(&ip4->addr_seen, &ip4->addr, sizeof(ip4->addr_seen)); if (MAC_IS_ZERO(mac))-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
On Thu, 3 Nov 2022 14:17:57 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:On Thu, Nov 03, 2022 at 12:04:41AM +0100, Stefano Brivio wrote:Added in v2.Seen in a Google Compute Engine environment with a machine configured via cloud-init-dhcp, while testing Podman integration for pasta: the assigned address has a /32 netmask, and there's a default route, which can be added on the host because there's another route, also /32, pointing to the default gateway.I'm afraid I'm having trouble getting a good picture of the situation from this description. I think an actual example with addresses would make it much clearer.Well, yes, there's a check, and if that fails, we adjust the netmask, as this paragraph says. Reporting "[c]onsistency check and eventual adjustment" everywhere sounds a bit heavy.This is not a valid configuration as far as I can tell: if the address is configured as /32, it shouldn't be used to reach a gateway outside its derived netmask. However, Linux allows that, and everything works. The problem comes when pasta --config-net sources address and default route from the host, and it can't configure the route in the target namespace because the gateway is invalid. Sourcing more routes than just the default is doable, but probably undesirable: pasta users want to provide connectivity to a container, not reflect exactly whatever trickery is configured on the host. Add a consistency check: if the configured default gateway is not reachable, shrink the given netmask until we can reach it.Hmm... this isn't merely a check, it's changing an otherwise configured value.Added in v2. -- StefanoSigned-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- conf.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/conf.c b/conf.c index 90214f5..5b88547 100644 --- a/conf.c +++ b/conf.c @@ -562,6 +562,10 @@ static unsigned int conf_ip4(unsigned int ifi, ip4->mask = 0xffffffff; } + /* Mask consistency check: ensure we can reach the default gateway */Since this is to handle a very weird situation, we absolutely need a more detailed comment here.
With --dns-forward, if the host has a loopback address configured as DNS server, we should actually use it to forward queries, but, if --no-map-gw is passed, we shouldn't offer the same address via DHCP, NDP and DHCPv6, because it's not going to be reachable. Problematic configuration: systemd-resolved configuring the usual 127.0.0.53 on the host, and --dns-forward specified with an unrelated address. We still want to forward queries to 127.0.0.53, so we can't drop it from the addresses in IPv4 and IPv6 context, but we shouldn't offer that address either. With this change, I'm only covering the case of automatically configured DNS servers from /etc/resolv.conf. We could extend this to addresses configured with command-line options, but I don't really see a likely use case at this point. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- conf.c | 50 ++++++++++++++++++++++++++++++++++---------------- dhcp.c | 5 +++-- dhcpv6.c | 5 +++-- ndp.c | 6 +++--- passt.h | 8 ++++++-- 5 files changed, 49 insertions(+), 25 deletions(-) diff --git a/conf.c b/conf.c index 5b88547..c4e1030 100644 --- a/conf.c +++ b/conf.c @@ -355,10 +355,11 @@ overlap: */ static void get_dns(struct ctx *c) { + uint32_t *dns4 = &c->ip4.dns[0], *dns4_send = &c->ip4.dns_send[0]; + struct in6_addr *dns6_send = &c->ip6.dns_send[0]; int dns4_set, dns6_set, dnss_set, dns_set, fd; struct in6_addr *dns6 = &c->ip6.dns[0]; struct fqdn *s = c->dns_search; - uint32_t *dns4 = &c->ip4.dns[0]; struct lineread resolvconf; int line_len; char *line, *p, *end; @@ -388,30 +389,45 @@ static void get_dns(struct ctx *c) if (!dns4_set && dns4 - &c->ip4.dns[0] < ARRAY_SIZE(c->ip4.dns) - 1 && inet_pton(AF_INET, p + 1, dns4)) { - /* We can only access local addresses via the gw redirect */ - if (ntohl(*dns4) >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET) { - if (c->no_map_gw) { - *dns4 = 0; + /* Guest or container can only access local + * addresses via local redirect + */ + if (IPV4_IS_LOOPBACK(ntohl(*dns4))) { + if (c->no_map_gw) continue; - } - *dns4 = c->ip4.gw; + + *dns4_send = c->ip4.gw; + } else { + *dns4_send = *dns4; } + + dns4_send++; dns4++; - *dns4 = 0; + + *dns4 = *dns4_send = 0; } if (!dns6_set && dns6 - &c->ip6.dns[0] < ARRAY_SIZE(c->ip6.dns) - 1 && inet_pton(AF_INET6, p + 1, dns6)) { - /* We can only access local addresses via the gw redirect */ + /* Guest or container can only access local + * addresses via local redirect + */ if (IN6_IS_ADDR_LOOPBACK(dns6)) { - if (c->no_map_gw) { - memset(dns6, 0, sizeof(*dns6)); + if (c->no_map_gw) continue; - } - memcpy(dns6, &c->ip6.gw, sizeof(*dns6)); + + memcpy(dns6_send, &c->ip6.gw, + sizeof(*dns6_send)); + } else { + memcpy(dns6_send, &c->ip6.gw, + sizeof(*dns6_send)); } + + dns6_send++; dns6++; + + memset(dns6_send, 0, sizeof(*dns6_send)); memset(dns6, 0, sizeof(*dns6)); } } else if (!dnss_set && strstr(line, "search ") == line && @@ -832,10 +848,11 @@ static void conf_print(const struct ctx *c) inet_ntop(AF_INET, &c->ip4.gw, buf4, sizeof(buf4))); } - for (i = 0; c->ip4.dns[i]; i++) { + for (i = 0; c->ip4.dns_send[i]; i++) { if (!i) info("DNS:"); - inet_ntop(AF_INET, &c->ip4.dns[i], buf4, sizeof(buf4)); + inet_ntop(AF_INET, &c->ip4.dns_send[i], buf4, + sizeof(buf4)); info(" %s", buf4); } @@ -866,7 +883,8 @@ static void conf_print(const struct ctx *c) inet_ntop(AF_INET6, &c->ip6.addr_ll, buf6, sizeof(buf6))); dns6: - for (i = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns[i]); i++) { + for (i = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_send[i]); + i++) { if (!i) info("DNS:"); inet_ntop(AF_INET6, &c->ip6.dns[i], buf6, sizeof(buf6)); diff --git a/dhcp.c b/dhcp.c index d22698a..e816eb6 100644 --- a/dhcp.c +++ b/dhcp.c @@ -355,8 +355,9 @@ int dhcp(const struct ctx *c, const struct pool *p) opts[26].s[1] = c->mtu % 256; } - for (i = 0, opts[6].slen = 0; !c->no_dhcp_dns && c->ip4.dns[i]; i++) { - ((uint32_t *)opts[6].s)[i] = c->ip4.dns[i]; + opts[6].slen = 0; + for (i = 0; !c->no_dhcp_dns && c->ip4.dns_send[i]; i++) { + ((uint32_t *)opts[6].s)[i] = c->ip4.dns_send[i]; opts[6].slen += sizeof(uint32_t); } diff --git a/dhcpv6.c b/dhcpv6.c index e763aed..67262e6 100644 --- a/dhcpv6.c +++ b/dhcpv6.c @@ -379,7 +379,7 @@ static size_t dhcpv6_dns_fill(const struct ctx *c, char *buf, int offset) if (c->no_dhcp_dns) goto search; - for (i = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns[i]); i++) { + for (i = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_send[i]); i++) { if (!i) { srv = (struct opt_dns_servers *)(buf + offset); offset += sizeof(struct opt_hdr); @@ -387,7 +387,8 @@ static size_t dhcpv6_dns_fill(const struct ctx *c, char *buf, int offset) srv->hdr.l = 0; } - memcpy(&srv->addr[i], &c->ip6.dns[i], sizeof(srv->addr[i])); + memcpy(&srv->addr[i], &c->ip6.dns_send[i], + sizeof(srv->addr[i])); srv->hdr.l += sizeof(srv->addr[i]); offset += sizeof(srv->addr[i]); } diff --git a/ndp.c b/ndp.c index 80e1f19..6d79477 100644 --- a/ndp.c +++ b/ndp.c @@ -121,7 +121,7 @@ int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr) if (c->no_dhcp_dns) goto dns_done; - for (n = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns[n]); n++); + for (n = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_send[n]); n++); if (n) { *p++ = 25; /* RDNSS */ *p++ = 1 + 2 * n; /* length */ @@ -130,8 +130,8 @@ int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr) p += 4; for (i = 0; i < n; i++) { - memcpy(p, &c->ip6.dns[i], 16); /* address */ - p += 16; + memcpy(p, &c->ip6.dns_send[i], 16); + p += 16; /* address */ } for (n = 0; *c->dns_search[n].n; n++) diff --git a/passt.h b/passt.h index 67281db..9f9bf3b 100644 --- a/passt.h +++ b/passt.h @@ -101,7 +101,8 @@ enum passt_modes { * @addr_seen: Latest IPv4 address seen as source from tap * @mask: IPv4 netmask, network order * @gw: Default IPv4 gateway, network order - * @dns: IPv4 DNS addresses, zero-terminated, network order + * @dns: Host IPv4 DNS addresses, zero-terminated, network order + * @dns_send: Offered IPv4 DNS, zero-terminated, network order * @dns_fwd: Address forwarded (UDP) to first IPv4 DNS, network order */ struct ip4_ctx { @@ -110,6 +111,7 @@ struct ip4_ctx { uint32_t mask; uint32_t gw; uint32_t dns[MAXNS + 1]; + uint32_t dns_send[MAXNS + 1]; uint32_t dns_fwd; }; @@ -120,7 +122,8 @@ struct ip4_ctx { * @addr_seen: Latest IPv6 global/site address seen as source from tap * @addr_ll_seen: Latest IPv6 link-local address seen as source from tap * @gw: Default IPv6 gateway - * @dns: IPv6 DNS addresses, zero-terminated + * @dns: Host IPv6 DNS addresses, zero-terminated + * @dns_send: Offered IPv6 DNS addresses, zero-terminated * @dns_fwd: Address forwarded (UDP) to first IPv6 DNS, network order */ struct ip6_ctx { @@ -130,6 +133,7 @@ struct ip6_ctx { struct in6_addr addr_ll_seen; struct in6_addr gw; struct in6_addr dns[MAXNS + 1]; + struct in6_addr dns_send[MAXNS + 1]; struct in6_addr dns_fwd; }; -- 2.35.1
On Thu, Nov 03, 2022 at 12:04:42AM +0100, Stefano Brivio wrote:With --dns-forward, if the host has a loopback address configured as DNS server, we should actually use it to forward queries, but, if --no-map-gw is passed, we shouldn't offer the same address via DHCP, NDP and DHCPv6, because it's not going to be reachable. Problematic configuration: systemd-resolved configuring the usual 127.0.0.53 on the host, and --dns-forward specified with an unrelated address. We still want to forward queries to 127.0.0.53, so we can't drop it from the addresses in IPv4 and IPv6 context,I'm not entirely sure what you mean by that.but we shouldn't offer that address either. With this change, I'm only covering the case of automatically configured DNS servers from /etc/resolv.conf. We could extend this to addresses configured with command-line options, but I don't really see a likely use case at this point. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- conf.c | 50 ++++++++++++++++++++++++++++++++++---------------- dhcp.c | 5 +++-- dhcpv6.c | 5 +++-- ndp.c | 6 +++--- passt.h | 8 ++++++-- 5 files changed, 49 insertions(+), 25 deletions(-) diff --git a/conf.c b/conf.c index 5b88547..c4e1030 100644 --- a/conf.c +++ b/conf.c @@ -355,10 +355,11 @@ overlap: */ static void get_dns(struct ctx *c) { + uint32_t *dns4 = &c->ip4.dns[0], *dns4_send = &c->ip4.dns_send[0]; + struct in6_addr *dns6_send = &c->ip6.dns_send[0]; int dns4_set, dns6_set, dnss_set, dns_set, fd; struct in6_addr *dns6 = &c->ip6.dns[0]; struct fqdn *s = c->dns_search; - uint32_t *dns4 = &c->ip4.dns[0]; struct lineread resolvconf; int line_len; char *line, *p, *end; @@ -388,30 +389,45 @@ static void get_dns(struct ctx *c) if (!dns4_set && dns4 - &c->ip4.dns[0] < ARRAY_SIZE(c->ip4.dns) - 1 && inet_pton(AF_INET, p + 1, dns4)) { - /* We can only access local addresses via the gw redirect */ - if (ntohl(*dns4) >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET) { - if (c->no_map_gw) { - *dns4 = 0; + /* Guest or container can only access local + * addresses via local redirect + */ + if (IPV4_IS_LOOPBACK(ntohl(*dns4))) { + if (c->no_map_gw) continue;In this case shouldn't you still be recording the local address in the dns[] array (but not dns_send[]) since it's a valid nameserver for the host. In which case you'd need to advance the dns4 pointer. If I'm mistaken and you don't want to record it in the dns[] array, then shouldn't you clear it (because otherwise you will record it if this is the last "nameserver" line).- } - *dns4 = c->ip4.gw; + + *dns4_send = c->ip4.gw; + } else { + *dns4_send = *dns4; }I think it would be clearer to update *dns4 if necessary, then set *dns4_send = *dns4 outside the if statement.+ + dns4_send++; dns4++; - *dns4 = 0; + + *dns4 = *dns4_send = 0; } if (!dns6_set && dns6 - &c->ip6.dns[0] < ARRAY_SIZE(c->ip6.dns) - 1 && inet_pton(AF_INET6, p + 1, dns6)) { - /* We can only access local addresses via the gw redirect */ + /* Guest or container can only access local + * addresses via local redirect + */ if (IN6_IS_ADDR_LOOPBACK(dns6)) { - if (c->no_map_gw) { - memset(dns6, 0, sizeof(*dns6)); + if (c->no_map_gw) continue; - } - memcpy(dns6, &c->ip6.gw, sizeof(*dns6)); + + memcpy(dns6_send, &c->ip6.gw, + sizeof(*dns6_send)); + } else { + memcpy(dns6_send, &c->ip6.gw, + sizeof(*dns6_send)); } + + dns6_send++; dns6++; + + memset(dns6_send, 0, sizeof(*dns6_send)); memset(dns6, 0, sizeof(*dns6)); } } else if (!dnss_set && strstr(line, "search ") == line && @@ -832,10 +848,11 @@ static void conf_print(const struct ctx *c) inet_ntop(AF_INET, &c->ip4.gw, buf4, sizeof(buf4))); } - for (i = 0; c->ip4.dns[i]; i++) { + for (i = 0; c->ip4.dns_send[i]; i++) { if (!i) info("DNS:"); - inet_ntop(AF_INET, &c->ip4.dns[i], buf4, sizeof(buf4)); + inet_ntop(AF_INET, &c->ip4.dns_send[i], buf4, + sizeof(buf4)); info(" %s", buf4); } @@ -866,7 +883,8 @@ static void conf_print(const struct ctx *c) inet_ntop(AF_INET6, &c->ip6.addr_ll, buf6, sizeof(buf6))); dns6: - for (i = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns[i]); i++) { + for (i = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_send[i]); + i++) { if (!i) info("DNS:"); inet_ntop(AF_INET6, &c->ip6.dns[i], buf6, sizeof(buf6)); diff --git a/dhcp.c b/dhcp.c index d22698a..e816eb6 100644 --- a/dhcp.c +++ b/dhcp.c @@ -355,8 +355,9 @@ int dhcp(const struct ctx *c, const struct pool *p) opts[26].s[1] = c->mtu % 256; } - for (i = 0, opts[6].slen = 0; !c->no_dhcp_dns && c->ip4.dns[i]; i++) { - ((uint32_t *)opts[6].s)[i] = c->ip4.dns[i]; + opts[6].slen = 0; + for (i = 0; !c->no_dhcp_dns && c->ip4.dns_send[i]; i++) { + ((uint32_t *)opts[6].s)[i] = c->ip4.dns_send[i]; opts[6].slen += sizeof(uint32_t); } diff --git a/dhcpv6.c b/dhcpv6.c index e763aed..67262e6 100644 --- a/dhcpv6.c +++ b/dhcpv6.c @@ -379,7 +379,7 @@ static size_t dhcpv6_dns_fill(const struct ctx *c, char *buf, int offset) if (c->no_dhcp_dns) goto search; - for (i = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns[i]); i++) { + for (i = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_send[i]); i++) { if (!i) { srv = (struct opt_dns_servers *)(buf + offset); offset += sizeof(struct opt_hdr); @@ -387,7 +387,8 @@ static size_t dhcpv6_dns_fill(const struct ctx *c, char *buf, int offset) srv->hdr.l = 0; } - memcpy(&srv->addr[i], &c->ip6.dns[i], sizeof(srv->addr[i])); + memcpy(&srv->addr[i], &c->ip6.dns_send[i], + sizeof(srv->addr[i])); srv->hdr.l += sizeof(srv->addr[i]); offset += sizeof(srv->addr[i]); } diff --git a/ndp.c b/ndp.c index 80e1f19..6d79477 100644 --- a/ndp.c +++ b/ndp.c @@ -121,7 +121,7 @@ int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr) if (c->no_dhcp_dns) goto dns_done; - for (n = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns[n]); n++); + for (n = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_send[n]); n++); if (n) { *p++ = 25; /* RDNSS */ *p++ = 1 + 2 * n; /* length */ @@ -130,8 +130,8 @@ int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr) p += 4; for (i = 0; i < n; i++) { - memcpy(p, &c->ip6.dns[i], 16); /* address */ - p += 16; + memcpy(p, &c->ip6.dns_send[i], 16); + p += 16; /* address */ } for (n = 0; *c->dns_search[n].n; n++) diff --git a/passt.h b/passt.h index 67281db..9f9bf3b 100644 --- a/passt.h +++ b/passt.h @@ -101,7 +101,8 @@ enum passt_modes { * @addr_seen: Latest IPv4 address seen as source from tap * @mask: IPv4 netmask, network order * @gw: Default IPv4 gateway, network order - * @dns: IPv4 DNS addresses, zero-terminated, network order + * @dns: Host IPv4 DNS addresses, zero-terminated, network order + * @dns_send: Offered IPv4 DNS, zero-terminated, network order * @dns_fwd: Address forwarded (UDP) to first IPv4 DNS, network order */ struct ip4_ctx { @@ -110,6 +111,7 @@ struct ip4_ctx { uint32_t mask; uint32_t gw; uint32_t dns[MAXNS + 1]; + uint32_t dns_send[MAXNS + 1]; uint32_t dns_fwd; }; @@ -120,7 +122,8 @@ struct ip4_ctx { * @addr_seen: Latest IPv6 global/site address seen as source from tap * @addr_ll_seen: Latest IPv6 link-local address seen as source from tap * @gw: Default IPv6 gateway - * @dns: IPv6 DNS addresses, zero-terminated + * @dns: Host IPv6 DNS addresses, zero-terminated + * @dns_send: Offered IPv6 DNS addresses, zero-terminated * @dns_fwd: Address forwarded (UDP) to first IPv6 DNS, network order */ struct ip6_ctx { @@ -130,6 +133,7 @@ struct ip6_ctx { struct in6_addr addr_ll_seen; struct in6_addr gw; struct in6_addr dns[MAXNS + 1]; + struct in6_addr dns_send[MAXNS + 1]; struct in6_addr dns_fwd; };-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
On Thu, 3 Nov 2022 14:37:18 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:On Thu, Nov 03, 2022 at 12:04:42AM +0100, Stefano Brivio wrote:Hopefully clarified enough in v2.With --dns-forward, if the host has a loopback address configured as DNS server, we should actually use it to forward queries, but, if --no-map-gw is passed, we shouldn't offer the same address via DHCP, NDP and DHCPv6, because it's not going to be reachable. Problematic configuration: systemd-resolved configuring the usual 127.0.0.53 on the host, and --dns-forward specified with an unrelated address. We still want to forward queries to 127.0.0.53, so we can't drop it from the addresses in IPv4 and IPv6 context,I'm not entirely sure what you mean by that.Oops, right, fixed in v2.but we shouldn't offer that address either. With this change, I'm only covering the case of automatically configured DNS servers from /etc/resolv.conf. We could extend this to addresses configured with command-line options, but I don't really see a likely use case at this point. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- conf.c | 50 ++++++++++++++++++++++++++++++++++---------------- dhcp.c | 5 +++-- dhcpv6.c | 5 +++-- ndp.c | 6 +++--- passt.h | 8 ++++++-- 5 files changed, 49 insertions(+), 25 deletions(-) diff --git a/conf.c b/conf.c index 5b88547..c4e1030 100644 --- a/conf.c +++ b/conf.c @@ -355,10 +355,11 @@ overlap: */ static void get_dns(struct ctx *c) { + uint32_t *dns4 = &c->ip4.dns[0], *dns4_send = &c->ip4.dns_send[0]; + struct in6_addr *dns6_send = &c->ip6.dns_send[0]; int dns4_set, dns6_set, dnss_set, dns_set, fd; struct in6_addr *dns6 = &c->ip6.dns[0]; struct fqdn *s = c->dns_search; - uint32_t *dns4 = &c->ip4.dns[0]; struct lineread resolvconf; int line_len; char *line, *p, *end; @@ -388,30 +389,45 @@ static void get_dns(struct ctx *c) if (!dns4_set && dns4 - &c->ip4.dns[0] < ARRAY_SIZE(c->ip4.dns) - 1 && inet_pton(AF_INET, p + 1, dns4)) { - /* We can only access local addresses via the gw redirect */ - if (ntohl(*dns4) >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET) { - if (c->no_map_gw) { - *dns4 = 0; + /* Guest or container can only access local + * addresses via local redirect + */ + if (IPV4_IS_LOOPBACK(ntohl(*dns4))) { + if (c->no_map_gw) continue;In this case shouldn't you still be recording the local address in the dns[] array (but not dns_send[]) since it's a valid nameserver for the host. In which case you'd need to advance the dns4 pointer.If I'm mistaken and you don't want to record it in the dns[] array, then shouldn't you clear it (because otherwise you will record it if this is the last "nameserver" line).Probably not relevant now that I fixed the case you mentioned above. -- Stefano- } - *dns4 = c->ip4.gw; + + *dns4_send = c->ip4.gw; + } else { + *dns4_send = *dns4; }I think it would be clearer to update *dns4 if necessary, then set *dns4_send = *dns4 outside the if statement.
Now that we allow loopback DNS addresses to be used as targets for forwarding, we need to check if DNS answers come from those targets, before deciding to eventually remap traffic for local redirects. Otherwise, the source address won't match the one configured as forwarder, which means that the guest or the container will refuse those responses. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- udp.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/udp.c b/udp.c index 4b201d3..7c77e09 100644 --- a/udp.c +++ b/udp.c @@ -680,8 +680,10 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n, src = ntohl(b->s_in.sin_addr.s_addr); src_port = ntohs(b->s_in.sin_port); - if (src >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET || - src == INADDR_ANY || src == ntohl(c->ip4.addr_seen)) { + if (c->ip4.dns_fwd && src == htonl(c->ip4.dns[0]) && src_port == 53) { + b->iph.saddr = c->ip4.dns_fwd; + } else if (IPV4_IS_LOOPBACK(src) || src == INADDR_ANY || + src == ntohl(c->ip4.addr_seen)) { b->iph.saddr = c->ip4.gw; udp_tap_map[V4][src_port].ts = now->tv_sec; udp_tap_map[V4][src_port].flags |= PORT_LOCAL; @@ -692,9 +694,6 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n, udp_tap_map[V4][src_port].flags |= PORT_LOOPBACK; bitmap_set(udp_act[V4][UDP_ACT_TAP], src_port); - } else if (c->ip4.dns_fwd && - src == htonl(c->ip4.dns[0]) && src_port == 53) { - b->iph.saddr = c->ip4.dns_fwd; } else { b->iph.saddr = b->s_in.sin_addr.s_addr; } @@ -770,6 +769,10 @@ static void udp_sock_fill_data_v6(const struct ctx *c, int n, if (IN6_IS_ADDR_LINKLOCAL(src)) { b->ip6h.daddr = c->ip6.addr_ll_seen; b->ip6h.saddr = b->s_in6.sin6_addr; + } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_fwd) && + IN6_ARE_ADDR_EQUAL(src, &c->ip6.dns[0]) && src_port == 53) { + b->ip6h.daddr = c->ip6.addr_seen; + b->ip6h.saddr = c->ip6.dns_fwd; } else if (IN6_IS_ADDR_LOOPBACK(src) || IN6_ARE_ADDR_EQUAL(src, &c->ip6.addr_seen) || IN6_ARE_ADDR_EQUAL(src, &c->ip6.addr)) { @@ -794,10 +797,6 @@ static void udp_sock_fill_data_v6(const struct ctx *c, int n, udp_tap_map[V6][src_port].flags &= ~PORT_GUA; bitmap_set(udp_act[V6][UDP_ACT_TAP], src_port); - } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_fwd) && - IN6_ARE_ADDR_EQUAL(src, &c->ip6.dns[0]) && src_port == 53) { - b->ip6h.daddr = c->ip6.addr_seen; - b->ip6h.saddr = c->ip6.dns_fwd; } else { b->ip6h.daddr = c->ip6.addr_seen; b->ip6h.saddr = b->s_in6.sin6_addr; -- 2.35.1
On Thu, Nov 03, 2022 at 12:04:43AM +0100, Stefano Brivio wrote:Now that we allow loopback DNS addresses to be used as targets for forwarding, we need to check if DNS answers come from those targets, before deciding to eventually remap traffic for local redirects. Otherwise, the source address won't match the one configured as forwarder, which means that the guest or the container will refuse those responses. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- udp.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/udp.c b/udp.c index 4b201d3..7c77e09 100644 --- a/udp.c +++ b/udp.c @@ -680,8 +680,10 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n, src = ntohl(b->s_in.sin_addr.s_addr); src_port = ntohs(b->s_in.sin_port); - if (src >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET || - src == INADDR_ANY || src == ntohl(c->ip4.addr_seen)) { + if (c->ip4.dns_fwd && src == htonl(c->ip4.dns[0]) && src_port == 53) {I guess this is not a newly introduced bug, but for the case of multiple host nameservers, don't you need to check against everything in the ip4.dns[] array, not just entry 0?+ b->iph.saddr = c->ip4.dns_fwd; + } else if (IPV4_IS_LOOPBACK(src) || src == INADDR_ANY || + src == ntohl(c->ip4.addr_seen)) { b->iph.saddr = c->ip4.gw; udp_tap_map[V4][src_port].ts = now->tv_sec; udp_tap_map[V4][src_port].flags |= PORT_LOCAL; @@ -692,9 +694,6 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n, udp_tap_map[V4][src_port].flags |= PORT_LOOPBACK; bitmap_set(udp_act[V4][UDP_ACT_TAP], src_port); - } else if (c->ip4.dns_fwd && - src == htonl(c->ip4.dns[0]) && src_port == 53) { - b->iph.saddr = c->ip4.dns_fwd; } else { b->iph.saddr = b->s_in.sin_addr.s_addr; } @@ -770,6 +769,10 @@ static void udp_sock_fill_data_v6(const struct ctx *c, int n, if (IN6_IS_ADDR_LINKLOCAL(src)) { b->ip6h.daddr = c->ip6.addr_ll_seen; b->ip6h.saddr = b->s_in6.sin6_addr; + } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_fwd) && + IN6_ARE_ADDR_EQUAL(src, &c->ip6.dns[0]) && src_port == 53) { + b->ip6h.daddr = c->ip6.addr_seen; + b->ip6h.saddr = c->ip6.dns_fwd; } else if (IN6_IS_ADDR_LOOPBACK(src) || IN6_ARE_ADDR_EQUAL(src, &c->ip6.addr_seen) || IN6_ARE_ADDR_EQUAL(src, &c->ip6.addr)) { @@ -794,10 +797,6 @@ static void udp_sock_fill_data_v6(const struct ctx *c, int n, udp_tap_map[V6][src_port].flags &= ~PORT_GUA; bitmap_set(udp_act[V6][UDP_ACT_TAP], src_port); - } else if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns_fwd) && - IN6_ARE_ADDR_EQUAL(src, &c->ip6.dns[0]) && src_port == 53) { - b->ip6h.daddr = c->ip6.addr_seen; - b->ip6h.saddr = c->ip6.dns_fwd; } else { b->ip6h.daddr = c->ip6.addr_seen; b->ip6h.saddr = b->s_in6.sin6_addr;-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
On Thu, 3 Nov 2022 14:42:13 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:On Thu, Nov 03, 2022 at 12:04:43AM +0100, Stefano Brivio wrote:No, because that's the only one we're using as target for forwarded queries -- and DNS answers we want to check here are only the forwarded ones. -- StefanoNow that we allow loopback DNS addresses to be used as targets for forwarding, we need to check if DNS answers come from those targets, before deciding to eventually remap traffic for local redirects. Otherwise, the source address won't match the one configured as forwarder, which means that the guest or the container will refuse those responses. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- udp.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/udp.c b/udp.c index 4b201d3..7c77e09 100644 --- a/udp.c +++ b/udp.c @@ -680,8 +680,10 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n, src = ntohl(b->s_in.sin_addr.s_addr); src_port = ntohs(b->s_in.sin_port); - if (src >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET || - src == INADDR_ANY || src == ntohl(c->ip4.addr_seen)) { + if (c->ip4.dns_fwd && src == htonl(c->ip4.dns[0]) && src_port == 53) {I guess this is not a newly introduced bug, but for the case of multiple host nameservers, don't you need to check against everything in the ip4.dns[] array, not just entry 0?
On Thu, Nov 03, 2022 at 07:42:51AM +0100, Stefano Brivio wrote:On Thu, 3 Nov 2022 14:42:13 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:*thinks* .. ok, that makes sense. But if that's the case, won't ip4.dns[0] be the only entry in ip4.dns[] we use for anything at all? Can we drop the table and just keep one entry? -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibsonOn Thu, Nov 03, 2022 at 12:04:43AM +0100, Stefano Brivio wrote:No, because that's the only one we're using as target for forwarded queries -- and DNS answers we want to check here are only the forwarded ones.Now that we allow loopback DNS addresses to be used as targets for forwarding, we need to check if DNS answers come from those targets, before deciding to eventually remap traffic for local redirects. Otherwise, the source address won't match the one configured as forwarder, which means that the guest or the container will refuse those responses. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- udp.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/udp.c b/udp.c index 4b201d3..7c77e09 100644 --- a/udp.c +++ b/udp.c @@ -680,8 +680,10 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n, src = ntohl(b->s_in.sin_addr.s_addr); src_port = ntohs(b->s_in.sin_port); - if (src >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET || - src == INADDR_ANY || src == ntohl(c->ip4.addr_seen)) { + if (c->ip4.dns_fwd && src == htonl(c->ip4.dns[0]) && src_port == 53) {I guess this is not a newly introduced bug, but for the case of multiple host nameservers, don't you need to check against everything in the ip4.dns[] array, not just entry 0?
On Sat, 5 Nov 2022 12:19:55 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:On Thu, Nov 03, 2022 at 07:42:51AM +0100, Stefano Brivio wrote:Now that we have ip{4,6}.dns_send[], yes. We could rename .dns_send[] back to .dns[] and change the current .dns[] to .own_dns, or .fwd_dns_target, something like that. Other naming ideas welcome. I wanted the change in 2/3 to be simple and fix-like, but I can do this rework soon so that you don't _have_ to. :) -- StefanoOn Thu, 3 Nov 2022 14:42:13 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:*thinks* .. ok, that makes sense. But if that's the case, won't ip4.dns[0] be the only entry in ip4.dns[] we use for anything at all? Can we drop the table and just keep one entry?On Thu, Nov 03, 2022 at 12:04:43AM +0100, Stefano Brivio wrote:No, because that's the only one we're using as target for forwarded queries -- and DNS answers we want to check here are only the forwarded ones.Now that we allow loopback DNS addresses to be used as targets for forwarding, we need to check if DNS answers come from those targets, before deciding to eventually remap traffic for local redirects. Otherwise, the source address won't match the one configured as forwarder, which means that the guest or the container will refuse those responses. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- udp.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/udp.c b/udp.c index 4b201d3..7c77e09 100644 --- a/udp.c +++ b/udp.c @@ -680,8 +680,10 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n, src = ntohl(b->s_in.sin_addr.s_addr); src_port = ntohs(b->s_in.sin_port); - if (src >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET || - src == INADDR_ANY || src == ntohl(c->ip4.addr_seen)) { + if (c->ip4.dns_fwd && src == htonl(c->ip4.dns[0]) && src_port == 53) {I guess this is not a newly introduced bug, but for the case of multiple host nameservers, don't you need to check against everything in the ip4.dns[] array, not just entry 0?
On Sat, Nov 05, 2022 at 08:22:23AM +0100, Stefano Brivio wrote:On Sat, 5 Nov 2022 12:19:55 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Right, that's what I meant.On Thu, Nov 03, 2022 at 07:42:51AM +0100, Stefano Brivio wrote:Now that we have ip{4,6}.dns_send[], yes.On Thu, 3 Nov 2022 14:42:13 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:*thinks* .. ok, that makes sense. But if that's the case, won't ip4.dns[0] be the only entry in ip4.dns[] we use for anything at all? Can we drop the table and just keep one entry?On Thu, Nov 03, 2022 at 12:04:43AM +0100, Stefano Brivio wrote: > Now that we allow loopback DNS addresses to be used as targets for > forwarding, we need to check if DNS answers come from those targets, > before deciding to eventually remap traffic for local redirects. > > Otherwise, the source address won't match the one configured as > forwarder, which means that the guest or the container will refuse > those responses. > > Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> > --- > udp.c | 17 ++++++++--------- > 1 file changed, 8 insertions(+), 9 deletions(-) > > diff --git a/udp.c b/udp.c > index 4b201d3..7c77e09 100644 > --- a/udp.c > +++ b/udp.c > @@ -680,8 +680,10 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n, > src = ntohl(b->s_in.sin_addr.s_addr); > src_port = ntohs(b->s_in.sin_port); > > - if (src >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET || > - src == INADDR_ANY || src == ntohl(c->ip4.addr_seen)) { > + if (c->ip4.dns_fwd && src == htonl(c->ip4.dns[0]) && src_port == 53) { I guess this is not a newly introduced bug, but for the case of multiple host nameservers, don't you need to check against everything in the ip4.dns[] array, not just entry 0?No, because that's the only one we're using as target for forwarded queries -- and DNS answers we want to check here are only the forwarded ones.We could rename .dns_send[] back to .dns[] and change the currentRight, I think dns[] is a better name for it..dns[] to .own_dns, or .fwd_dns_target, something like that. Other naming ideas welcome.Yeah, I find the current dns_fwd name not very illuminating either. *thinks* does it even make sense for dns_fwd not to be in dns_send? We're intercepting queries the guest sends to @dns_fwd, so surely we should also be advertising it to the guest. So what about: @dns: Primary DNS server advertised to guest - may be a fake address we'll intercept @dns_extra[]:Additional DNS servers advertised to guest. Must be real servers the guest can address without translation @host_dns: Host side DNS server (may be localhost or another address that's not guest accessible) The DHCP code advertises both @dns and @dns_extra, and that's the *only* place @dns_extra is used. UDP intercepts outbound packets for @dns and redirects them to @host_dns, likewise masquerading inbound packets from @host_dns to appear to have come from @dns.I wanted the change in 2/3 to be simple and fix-like, but I can do this rework soon so that you don't _have_ to. :)-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
On Mon, 7 Nov 2022 12:08:59 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:On Sat, Nov 05, 2022 at 08:22:23AM +0100, Stefano Brivio wrote:I wouldn't be so sure of that "surely". In the Podman test case where I hit this issue, I use Podman to write to /etc/resolv.conf directly, no DHCP/NDP/DHCPv6 involved, and things work. That doesn't automatically imply a use case for *not* advertising it, but we have several ways this can work without advertising anything, so there are also chances somebody might not want to advertise that in some obscure use case.On Sat, 5 Nov 2022 12:19:55 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Right, that's what I meant.On Thu, Nov 03, 2022 at 07:42:51AM +0100, Stefano Brivio wrote:Now that we have ip{4,6}.dns_send[], yes.On Thu, 3 Nov 2022 14:42:13 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote: > On Thu, Nov 03, 2022 at 12:04:43AM +0100, Stefano Brivio wrote: > > Now that we allow loopback DNS addresses to be used as targets for > > forwarding, we need to check if DNS answers come from those targets, > > before deciding to eventually remap traffic for local redirects. > > > > Otherwise, the source address won't match the one configured as > > forwarder, which means that the guest or the container will refuse > > those responses. > > > > Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> > > --- > > udp.c | 17 ++++++++--------- > > 1 file changed, 8 insertions(+), 9 deletions(-) > > > > diff --git a/udp.c b/udp.c > > index 4b201d3..7c77e09 100644 > > --- a/udp.c > > +++ b/udp.c > > @@ -680,8 +680,10 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n, > > src = ntohl(b->s_in.sin_addr.s_addr); > > src_port = ntohs(b->s_in.sin_port); > > > > - if (src >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET || > > - src == INADDR_ANY || src == ntohl(c->ip4.addr_seen)) { > > + if (c->ip4.dns_fwd && src == htonl(c->ip4.dns[0]) && src_port == 53) { > > I guess this is not a newly introduced bug, but for the case of > multiple host nameservers, don't you need to check against everything > in the ip4.dns[] array, not just entry 0? No, because that's the only one we're using as target for forwarded queries -- and DNS answers we want to check here are only the forwarded ones.*thinks* .. ok, that makes sense. But if that's the case, won't ip4.dns[0] be the only entry in ip4.dns[] we use for anything at all? Can we drop the table and just keep one entry?We could rename .dns_send[] back to .dns[] and change the currentRight, I think dns[] is a better name for it..dns[] to .own_dns, or .fwd_dns_target, something like that. Other naming ideas welcome.Yeah, I find the current dns_fwd name not very illuminating either. *thinks* does it even make sense for dns_fwd not to be in dns_send? We're intercepting queries the guest sends to @dns_fwd, so surely we should also be advertising it to the guest.So what about: @dns: Primary DNS server advertised to guest - may be a fake address we'll intercept @dns_extra[]:Additional DNS servers advertised to guest. Must be real servers the guest can address without translation @host_dns: Host side DNS server (may be localhost or another address that's not guest accessible) The DHCP code advertises both @dns and @dns_extra, and that's the *only* place @dns_extra is used.Also NDP and DHCPv6 use that, and checking one item plus one array in three places (difficult to share that code path) is uglier than just the array.UDP intercepts outbound packets for @dns and redirects them to @host_dns, likewise masquerading inbound packets from @host_dns to appear to have come from @dns.I see the general point, but we would still need a flag to allow users to disable the redirection. Given that, I'd rather go with dns, host_dns, and fwd_dns, it looks simpler. -- Stefano
On Mon, Nov 07, 2022 at 10:51:21AM +0100, Stefano Brivio wrote:On Mon, 7 Nov 2022 12:08:59 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Right, but only the case for not advertising it matters here, and I don't see one. @dns_fwd (or @dns_match, as we discussed calling it instead) is explicitly a virtual DNS server available to the guest. Whatever method the guest does use to configure itself, we should allow it to discover this via DHCP (or DHCPv6 or NDP).On Sat, Nov 05, 2022 at 08:22:23AM +0100, Stefano Brivio wrote:I wouldn't be so sure of that "surely". In the Podman test case where I hit this issue, I use Podman to write to /etc/resolv.conf directly, no DHCP/NDP/DHCPv6 involved, and things work. That doesn't automatically imply a use case for *not* advertising it, but we have several ways this can work without advertising anything, so there are also chances somebody might not want to advertise that in some obscure use case.On Sat, 5 Nov 2022 12:19:55 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Right, that's what I meant.On Thu, Nov 03, 2022 at 07:42:51AM +0100, Stefano Brivio wrote: > On Thu, 3 Nov 2022 14:42:13 +1100 > David Gibson <david(a)gibson.dropbear.id.au> wrote: > > > On Thu, Nov 03, 2022 at 12:04:43AM +0100, Stefano Brivio wrote: > > > Now that we allow loopback DNS addresses to be used as targets for > > > forwarding, we need to check if DNS answers come from those targets, > > > before deciding to eventually remap traffic for local redirects. > > > > > > Otherwise, the source address won't match the one configured as > > > forwarder, which means that the guest or the container will refuse > > > those responses. > > > > > > Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> > > > --- > > > udp.c | 17 ++++++++--------- > > > 1 file changed, 8 insertions(+), 9 deletions(-) > > > > > > diff --git a/udp.c b/udp.c > > > index 4b201d3..7c77e09 100644 > > > --- a/udp.c > > > +++ b/udp.c > > > @@ -680,8 +680,10 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n, > > > src = ntohl(b->s_in.sin_addr.s_addr); > > > src_port = ntohs(b->s_in.sin_port); > > > > > > - if (src >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET || > > > - src == INADDR_ANY || src == ntohl(c->ip4.addr_seen)) { > > > + if (c->ip4.dns_fwd && src == htonl(c->ip4.dns[0]) && src_port == 53) { > > > > I guess this is not a newly introduced bug, but for the case of > > multiple host nameservers, don't you need to check against everything > > in the ip4.dns[] array, not just entry 0? > > No, because that's the only one we're using as target for forwarded > queries -- and DNS answers we want to check here are only the forwarded > ones. *thinks* .. ok, that makes sense. But if that's the case, won't ip4.dns[0] be the only entry in ip4.dns[] we use for anything at all? Can we drop the table and just keep one entry?Now that we have ip{4,6}.dns_send[], yes.We could rename .dns_send[] back to .dns[] and change the currentRight, I think dns[] is a better name for it..dns[] to .own_dns, or .fwd_dns_target, something like that. Other naming ideas welcome.Yeah, I find the current dns_fwd name not very illuminating either. *thinks* does it even make sense for dns_fwd not to be in dns_send? We're intercepting queries the guest sends to @dns_fwd, so surely we should also be advertising it to the guest.-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibsonSo what about: @dns: Primary DNS server advertised to guest - may be a fake address we'll intercept @dns_extra[]:Additional DNS servers advertised to guest. Must be real servers the guest can address without translation @host_dns: Host side DNS server (may be localhost or another address that's not guest accessible) The DHCP code advertises both @dns and @dns_extra, and that's the *only* place @dns_extra is used.Also NDP and DHCPv6 use that, and checking one item plus one array in three places (difficult to share that code path) is uglier than just the array.UDP intercepts outbound packets for @dns and redirects them to @host_dns, likewise masquerading inbound packets from @host_dns to appear to have come from @dns.I see the general point, but we would still need a flag to allow users to disable the redirection. Given that, I'd rather go with dns, host_dns, and fwd_dns, it looks simpler.
On Tue, 8 Nov 2022 16:51:35 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:On Mon, Nov 07, 2022 at 10:51:21AM +0100, Stefano Brivio wrote:Rather hypothetical: you don't want the guest/container to use a given address as resolver. You know that that address might be in its resolv.conf(5) because you don't have control over the image, wish to override it if possible, and at the same time keep a safety net. Slightly unrelated: we're talking about this in the perspective of getting rid of an explicit @dns_fwd/@dns_match. This would become a flag, indicating we should forward queries originally directed to 1. dns[0]... or 2. anything in dns[]? If it's just about 1. dns[0], we're forcing that address to be the first advertised resolver. If it's about 2. dns[], we're not giving anymore the possibility of forwarding queries originally directed to one a single address. -- StefanoOn Mon, 7 Nov 2022 12:08:59 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Right, but only the case for not advertising it matters here, and I don't see one. @dns_fwd (or @dns_match, as we discussed calling it instead) is explicitly a virtual DNS server available to the guest. Whatever method the guest does use to configure itself, we should allow it to discover this via DHCP (or DHCPv6 or NDP).On Sat, Nov 05, 2022 at 08:22:23AM +0100, Stefano Brivio wrote:I wouldn't be so sure of that "surely". In the Podman test case where I hit this issue, I use Podman to write to /etc/resolv.conf directly, no DHCP/NDP/DHCPv6 involved, and things work. That doesn't automatically imply a use case for *not* advertising it, but we have several ways this can work without advertising anything, so there are also chances somebody might not want to advertise that in some obscure use case.On Sat, 5 Nov 2022 12:19:55 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote: > On Thu, Nov 03, 2022 at 07:42:51AM +0100, Stefano Brivio wrote: > > On Thu, 3 Nov 2022 14:42:13 +1100 > > David Gibson <david(a)gibson.dropbear.id.au> wrote: > > > > > On Thu, Nov 03, 2022 at 12:04:43AM +0100, Stefano Brivio wrote: > > > > Now that we allow loopback DNS addresses to be used as targets for > > > > forwarding, we need to check if DNS answers come from those targets, > > > > before deciding to eventually remap traffic for local redirects. > > > > > > > > Otherwise, the source address won't match the one configured as > > > > forwarder, which means that the guest or the container will refuse > > > > those responses. > > > > > > > > Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> > > > > --- > > > > udp.c | 17 ++++++++--------- > > > > 1 file changed, 8 insertions(+), 9 deletions(-) > > > > > > > > diff --git a/udp.c b/udp.c > > > > index 4b201d3..7c77e09 100644 > > > > --- a/udp.c > > > > +++ b/udp.c > > > > @@ -680,8 +680,10 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n, > > > > src = ntohl(b->s_in.sin_addr.s_addr); > > > > src_port = ntohs(b->s_in.sin_port); > > > > > > > > - if (src >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET || > > > > - src == INADDR_ANY || src == ntohl(c->ip4.addr_seen)) { > > > > + if (c->ip4.dns_fwd && src == htonl(c->ip4.dns[0]) && src_port == 53) { > > > > > > I guess this is not a newly introduced bug, but for the case of > > > multiple host nameservers, don't you need to check against everything > > > in the ip4.dns[] array, not just entry 0? > > > > No, because that's the only one we're using as target for forwarded > > queries -- and DNS answers we want to check here are only the forwarded > > ones. > > *thinks* .. ok, that makes sense. But if that's the case, won't > ip4.dns[0] be the only entry in ip4.dns[] we use for anything at all? > Can we drop the table and just keep one entry? Now that we have ip{4,6}.dns_send[], yes.Right, that's what I meant.We could rename .dns_send[] back to .dns[] and change the currentRight, I think dns[] is a better name for it..dns[] to .own_dns, or .fwd_dns_target, something like that. Other naming ideas welcome.Yeah, I find the current dns_fwd name not very illuminating either. *thinks* does it even make sense for dns_fwd not to be in dns_send? We're intercepting queries the guest sends to @dns_fwd, so surely we should also be advertising it to the guest.
On Tue, Nov 08, 2022 at 07:22:22AM +0100, Stefano Brivio wrote:On Tue, 8 Nov 2022 16:51:35 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Yeah, I guess. Seems pretty contrived.On Mon, Nov 07, 2022 at 10:51:21AM +0100, Stefano Brivio wrote:Rather hypothetical: you don't want the guest/container to use a given address as resolver. You know that that address might be in its resolv.conf(5) because you don't have control over the image, wish to override it if possible, and at the same time keep a safety net.On Mon, 7 Nov 2022 12:08:59 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Right, but only the case for not advertising it matters here, and I don't see one. @dns_fwd (or @dns_match, as we discussed calling it instead) is explicitly a virtual DNS server available to the guest. Whatever method the guest does use to configure itself, we should allow it to discover this via DHCP (or DHCPv6 or NDP).On Sat, Nov 05, 2022 at 08:22:23AM +0100, Stefano Brivio wrote: > On Sat, 5 Nov 2022 12:19:55 +1100 > David Gibson <david(a)gibson.dropbear.id.au> wrote: > > > On Thu, Nov 03, 2022 at 07:42:51AM +0100, Stefano Brivio wrote: > > > On Thu, 3 Nov 2022 14:42:13 +1100 > > > David Gibson <david(a)gibson.dropbear.id.au> wrote: > > > > > > > On Thu, Nov 03, 2022 at 12:04:43AM +0100, Stefano Brivio wrote: > > > > > Now that we allow loopback DNS addresses to be used as targets for > > > > > forwarding, we need to check if DNS answers come from those targets, > > > > > before deciding to eventually remap traffic for local redirects. > > > > > > > > > > Otherwise, the source address won't match the one configured as > > > > > forwarder, which means that the guest or the container will refuse > > > > > those responses. > > > > > > > > > > Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> > > > > > --- > > > > > udp.c | 17 ++++++++--------- > > > > > 1 file changed, 8 insertions(+), 9 deletions(-) > > > > > > > > > > diff --git a/udp.c b/udp.c > > > > > index 4b201d3..7c77e09 100644 > > > > > --- a/udp.c > > > > > +++ b/udp.c > > > > > @@ -680,8 +680,10 @@ static void udp_sock_fill_data_v4(const struct ctx *c, int n, > > > > > src = ntohl(b->s_in.sin_addr.s_addr); > > > > > src_port = ntohs(b->s_in.sin_port); > > > > > > > > > > - if (src >> IN_CLASSA_NSHIFT == IN_LOOPBACKNET || > > > > > - src == INADDR_ANY || src == ntohl(c->ip4.addr_seen)) { > > > > > + if (c->ip4.dns_fwd && src == htonl(c->ip4.dns[0]) && src_port == 53) { > > > > > > > > I guess this is not a newly introduced bug, but for the case of > > > > multiple host nameservers, don't you need to check against everything > > > > in the ip4.dns[] array, not just entry 0? > > > > > > No, because that's the only one we're using as target for forwarded > > > queries -- and DNS answers we want to check here are only the forwarded > > > ones. > > > > *thinks* .. ok, that makes sense. But if that's the case, won't > > ip4.dns[0] be the only entry in ip4.dns[] we use for anything at all? > > Can we drop the table and just keep one entry? > > Now that we have ip{4,6}.dns_send[], yes. Right, that's what I meant. > We could rename .dns_send[] back to .dns[] and change the current Right, I think dns[] is a better name for it. > .dns[] to .own_dns, or .fwd_dns_target, something like that. Other naming > ideas welcome. Yeah, I find the current dns_fwd name not very illuminating either. *thinks* does it even make sense for dns_fwd not to be in dns_send? We're intercepting queries the guest sends to @dns_fwd, so surely we should also be advertising it to the guest.I wouldn't be so sure of that "surely". In the Podman test case where I hit this issue, I use Podman to write to /etc/resolv.conf directly, no DHCP/NDP/DHCPv6 involved, and things work. That doesn't automatically imply a use case for *not* advertising it, but we have several ways this can work without advertising anything, so there are also chances somebody might not want to advertise that in some obscure use case.Slightly unrelated: we're talking about this in the perspective of getting rid of an explicit @dns_fwd/@dns_match. This would become a flag, indicating we should forward queries originally directed to 1. dns[0]... or 2. anything in dns[]? If it's just about 1. dns[0], we're forcing that address to be the first advertised resolver. If it's about 2. dns[], we're not giving anymore the possibility of forwarding queries originally directed to one a single address.Hmm... yes, those are fair points. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
On Thu, Nov 03, 2022 at 12:04:40AM +0100, Stefano Brivio wrote:These patches work around or fix some issues found while testing the Podman integration for pasta in Google Compute Engine environments.Drat. These will conflict pretty badly with the IPv4 address endian fixes I just posted. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
On Thu, 3 Nov 2022 14:13:39 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:On Thu, Nov 03, 2022 at 12:04:40AM +0100, Stefano Brivio wrote:Not a big issue, I just rebased on top of it. -- StefanoThese patches work around or fix some issues found while testing the Podman integration for pasta in Google Compute Engine environments.Drat. These will conflict pretty badly with the IPv4 address endian fixes I just posted.