v3: In 2/2, *really* pass the correct bind_addr reference to sock_l4() in udp_tap_handler() -- I just sent v2 from the wrong directory :/ Stefano Brivio (2): conf, passt.h: Rename "outbound" interface to "template" interface conf, icmp, tcp, udp: Add options to bind to outbound address and interface conf.c | 89 +++++++++++++++++++++++++++++++++++++++++++++++++++------ icmp.c | 24 +++++++++++++--- passt.1 | 19 ++++++++++++ passt.h | 14 +++++++-- tcp.c | 60 ++++++++++++++++++++++++++++++++++++++ udp.c | 54 +++++++++++++++++++++++----------- 6 files changed, 228 insertions(+), 32 deletions(-) -- 2.39.2
In preparation for the next patch, make it clear that the first routable interface fetched via netlink, or the one configured via -i/--interface, is simply used as template to copy addresses and routes, not an interface we actually use to derive the source address (which will be _bound to_) for outgoing packets. The man page and usage message appear to be already clear enough. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- conf.c | 11 +++++++---- passt.h | 4 ++-- 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/conf.c b/conf.c index 0e512f4..3aa3314 100644 --- a/conf.c +++ b/conf.c @@ -903,10 +903,13 @@ static void conf_print(const struct ctx *c) char buf4[INET_ADDRSTRLEN], ifn[IFNAMSIZ]; int i; - if (c->ifi4) - info("Outbound interface (IPv4): %s", if_indextoname(c->ifi4, ifn)); - if (c->ifi6) - info("Outbound interface (IPv6): %s", if_indextoname(c->ifi6, ifn)); + info("Template interface: %s%s%s%s%s", + c->ifi4 ? if_indextoname(c->ifi4, ifn) : "", + c->ifi4 ? " (IPv4)" : "", + (c->ifi4 && c->ifi6) ? ", " : "", + c->ifi6 ? if_indextoname(c->ifi6, ifn) : "", + c->ifi6 ? " (IPv6)" : ""); + if (c->mode == MODE_PASTA) info("Namespace interface: %s", c->pasta_ifn); diff --git a/passt.h b/passt.h index e0383eb..cc60c84 100644 --- a/passt.h +++ b/passt.h @@ -164,10 +164,10 @@ struct ip6_ctx { * @fd_tap: AF_UNIX socket, tuntap device, or pre-opened socket * @mac: Host MAC address * @mac_guest: MAC address of guest or namespace, seen or configured - * @ifi4: Index of routable interface for IPv4, 0 if IPv4 disabled + * @ifi4: Index of template interface for IPv4, 0 if IPv4 disabled * @ip: IPv4 configuration * @dns_search: DNS search list - * @ifi6: Index of routable interface for IPv6, 0 if IPv6 disabled + * @ifi6: Index of template interface for IPv6, 0 if IPv6 disabled * @ip6: IPv6 configuration * @pasta_ifn: Name of namespace interface for pasta * @pasta_ifn: Index of namespace interface for pasta -- 2.39.2
I didn't notice earlier: libslirp (and slirp4netns) supports binding outbound sockets to specific IPv4 and IPv6 addresses, to force the source addresse selection. If we want to claim feature parity, we should implement that as well. Further, Podman supports specifying outbound interfaces as well, but this is simply done by resolving the primary address for an interface when the network back-end is started. However, since kernel version 5.7, commit c427bfec18f2 ("net: core: enable SO_BINDTODEVICE for non-root users"), we can actually bind to a specific interface name, which doesn't need to be validated in advance. Implement -o / --outbound ADDR to bind to IPv4 and IPv6 addresses, and --outbound-ip4 and --outbound-ip6 to bind IPv4 and IPv6 sockets to given interfaces. For ICMP and UDP, we call sock_l4() to open outbound sockets, as we already needed to bind to given ports or echo identifiers, and we can bind() a socket only once: there, pass address (if any) and interface (if any) for the existing bind() and setsockopt() calls. For TCP, in general, we wouldn't otherwise bind sockets. Add a specific helper to do that. For UDP outbound sockets, we need to know if the final destination of the socket is a loopback address, before we decide whether it makes sense to bind the socket at all: move the block mangling the address destination before the creation of the socket in the IPv4 path. This was already the case for the IPv6 path. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- conf.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++---- icmp.c | 24 +++++++++++++++--- passt.1 | 19 ++++++++++++++ passt.h | 10 ++++++++ tcp.c | 60 ++++++++++++++++++++++++++++++++++++++++++++ udp.c | 54 ++++++++++++++++++++++++++------------- 6 files changed, 219 insertions(+), 26 deletions(-) diff --git a/conf.c b/conf.c index 3aa3314..15506ec 100644 --- a/conf.c +++ b/conf.c @@ -776,6 +776,13 @@ static void usage(const char *name) info( " default: gateway from interface with default route"); info( " -i, --interface NAME Interface for addresses and routes"); info( " default: interface with first default route"); + info( " -o, --outbound ADDR Bind to address as outbound source"); + info( " can be specified zero to two times (for IPv4 and IPv6)"); + info( " default: use source address from routing tables"); + info( " --outbound-if4 NAME Bind to outbound interface for IPv4"); + info( " default: use interface from default route"); + info( " --outbound-if6 NAME Bind to outbound interface for IPv6"); + info( " default: use interface from default route"); info( " -D, --dns ADDR Use IPv4 or IPv6 address as DNS"); info( " can be specified multiple times"); info( " a single, empty option disables DNS information"); @@ -900,7 +907,7 @@ pasta_opts: */ static void conf_print(const struct ctx *c) { - char buf4[INET_ADDRSTRLEN], ifn[IFNAMSIZ]; + char buf4[INET_ADDRSTRLEN], buf6[INET6_ADDRSTRLEN], ifn[IFNAMSIZ]; int i; info("Template interface: %s%s%s%s%s", @@ -910,6 +917,26 @@ static void conf_print(const struct ctx *c) c->ifi6 ? if_indextoname(c->ifi6, ifn) : "", c->ifi6 ? " (IPv6)" : ""); + if (*c->ip4.ifname_out || *c->ip6.ifname_out) { + info("Outbound interface: %s%s%s%s%s", + *c->ip4.ifname_out ? c->ip4.ifname_out : "", + *c->ip4.ifname_out ? " (IPv4)" : "", + (*c->ip4.ifname_out && *c->ip6.ifname_out) ? ", " : "", + *c->ip6.ifname_out ? c->ip6.ifname_out : "", + *c->ip6.ifname_out ? " (IPv6)" : ""); + } + + if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) || + !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) { + info("Outbound address: %s%s%s", + IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) ? "" : + inet_ntop(AF_INET, &c->ip4.addr_out, buf4, sizeof(buf4)), + (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) && + !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) ? ", " : "", + IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out) ? "" : + inet_ntop(AF_INET6, &c->ip6.addr_out, buf6, sizeof(buf6))); + } + if (c->mode == MODE_PASTA) info("Namespace interface: %s", c->pasta_ifn); @@ -948,8 +975,6 @@ static void conf_print(const struct ctx *c) } if (c->ifi6) { - char buf6[INET6_ADDRSTRLEN]; - if (!c->no_ndp && !c->no_dhcpv6) info("NDP/DHCPv6:"); else if (!c->no_ndp) @@ -1125,6 +1150,7 @@ void conf(struct ctx *c, int argc, char **argv) {"mac-addr", required_argument, NULL, 'M' }, {"gateway", required_argument, NULL, 'g' }, {"interface", required_argument, NULL, 'i' }, + {"outbound", required_argument, NULL, 'o' }, {"dns", required_argument, NULL, 'D' }, {"search", required_argument, NULL, 'S' }, {"no-tcp", no_argument, &c->no_tcp, 1 }, @@ -1157,6 +1183,8 @@ void conf(struct ctx *c, int argc, char **argv) {"runas", required_argument, NULL, 12 }, {"log-size", required_argument, NULL, 13 }, {"version", no_argument, NULL, 14 }, + {"outbound-if4", required_argument, NULL, 15 }, + {"outbound-if6", required_argument, NULL, 16 }, { 0 }, }; struct get_bound_ports_ns_arg ns_ports_arg = { .c = c }; @@ -1175,9 +1203,9 @@ void conf(struct ctx *c, int argc, char **argv) if (c->mode == MODE_PASTA) { c->no_dhcp_dns = c->no_dhcp_dns_search = 1; - optstring = "dqfel:hF:I:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:"; + optstring = "dqfel:hF:I:p:P:m:a:n:M:g:i:o:D:S:46t:u:T:U:"; } else { - optstring = "dqfel:hs:F:p:P:m:a:n:M:g:i:D:S:461t:u:"; + optstring = "dqfel:hs:F:p:P:m:a:n:M:g:i:o:D:S:461t:u:"; } c->tcp.fwd_in.mode = c->tcp.fwd_out.mode = 0; @@ -1295,6 +1323,26 @@ void conf(struct ctx *c, int argc, char **argv) c->mode == MODE_PASST ? "passt " : "pasta "); fprintf(stdout, VERSION_BLOB); exit(EXIT_SUCCESS); + case 15: + if (*c->ip4.ifname_out) + die("Redundant outbound interface: %s", optarg); + + ret = snprintf(c->ip4.ifname_out, + sizeof(c->ip4.ifname_out), "%s", optarg); + if (ret <= 0 || ret >= (int)sizeof(c->ip4.ifname_out)) + die("Invalid interface name: %s", optarg); + + break; + case 16: + if (*c->ip6.ifname_out) + die("Redundant outbound interface: %s", optarg); + + ret = snprintf(c->ip6.ifname_out, + sizeof(c->ip6.ifname_out), "%s", optarg); + if (ret <= 0 || ret >= (int)sizeof(c->ip6.ifname_out)) + die("Invalid interface name: %s", optarg); + + break; case 'd': if (c->debug) die("Multiple --debug options given"); @@ -1466,6 +1514,26 @@ void conf(struct ctx *c, int argc, char **argv) die("Invalid interface name %s: %s", optarg, strerror(errno)); break; + case 'o': + if (IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out) && + inet_pton(AF_INET6, optarg, &c->ip6.addr_out) && + !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out) && + !IN6_IS_ADDR_LOOPBACK(&c->ip6.addr_out) && + !IN6_IS_ADDR_V4MAPPED(&c->ip6.addr_out) && + !IN6_IS_ADDR_V4COMPAT(&c->ip6.addr_out) && + !IN6_IS_ADDR_MULTICAST(&c->ip6.addr_out)) + break; + + if (IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) && + inet_pton(AF_INET, optarg, &c->ip4.addr_out) && + !IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) && + !IN4_IS_ADDR_BROADCAST(&c->ip4.addr_out) && + !IN4_IS_ADDR_MULTICAST(&c->ip4.addr_out)) + break; + + die("Invalid or redundant outbound address: %s", + optarg); + break; case 'D': if (!strcmp(optarg, "none")) { if (c->no_dns) diff --git a/icmp.c b/icmp.c index b842fa8..ddf83f8 100644 --- a/icmp.c +++ b/icmp.c @@ -170,8 +170,16 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr, iref.icmp.id = id = ntohs(ih->un.echo.id); if ((s = icmp_id_map[V4][id].sock) <= 0) { - s = sock_l4(c, AF_INET, IPPROTO_ICMP, NULL, NULL, id, - iref.u32); + const struct in_addr *bind_addr = NULL; + const char *bind_if; + + bind_if = *c->ip4.ifname_out ? c->ip4.ifname_out : NULL; + + if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)) + bind_addr = &c->ip4.addr_out; + + s = sock_l4(c, AF_INET, IPPROTO_ICMP, bind_addr, + bind_if, id, iref.u32); if (s < 0) goto fail_sock; if (s > SOCKET_MAX) { @@ -216,8 +224,16 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr, iref.icmp.id = id = ntohs(ih->icmp6_identifier); if ((s = icmp_id_map[V6][id].sock) <= 0) { - s = sock_l4(c, AF_INET6, IPPROTO_ICMPV6, NULL, NULL, id, - iref.u32); + const struct in6_addr *bind_addr = NULL; + const char *bind_if; + + bind_if = *c->ip6.ifname_out ? c->ip6.ifname_out : NULL; + + if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) + bind_addr = &c->ip6.addr_out; + + s = sock_l4(c, AF_INET6, IPPROTO_ICMPV6, bind_addr, + bind_if, id, iref.u32); if (s < 0) goto fail_sock; if (s > SOCKET_MAX) { diff --git a/passt.1 b/passt.1 index f317c33..5c97f30 100644 --- a/passt.1 +++ b/passt.1 @@ -183,6 +183,25 @@ Use host interface \fIname\fR to derive addresses and routes. Default is to use the interfaces with the first default routes for each IP version. +.TP +.BR \-o ", " \-\-outbound " " \fIaddr +Use an IPv4 \fIaddr\fR as source address for IPv4 outbound TCP connections, UDP +flows, ICMP requests, or an IPv6 \fIaddr\fR for IPv6 ones, by binding outbound +sockets to it. +This option can be specified zero (for defaults) to two times (once for IPv4, +once for IPv6). +By default, the source address is selected by the routing tables. + +.TP +.BR \-\-outbound-if4 " " \fIname +Bind IPv4 outbound sockets to host interface \fIname\fR. +By default, the interface given by the default route is selected. + +.TP +.BR \-\-outbound-if6 " " \fIname +Bind IPv6 outbound sockets to host interface \fIname\fR. +By default, the interface given by the default route is selected. + .TP .BR \-D ", " \-\-dns " " \fIaddr Use \fIaddr\fR (IPv4 or IPv6) for DHCP, DHCPv6, NDP or DNS forwarding, as diff --git a/passt.h b/passt.h index cc60c84..b73f4ff 100644 --- a/passt.h +++ b/passt.h @@ -106,6 +106,8 @@ enum passt_modes { * @dns: DNS addresses for DHCP, zero-terminated, network order * @dns_match: Forward DNS query if sent to this address, network order * @dns_host: Use this DNS on the host for forwarding, network order + * @addr_out: Optional source address for outbound traffic + * @ifname_out: Optional interface name to bind outbound sockets to */ struct ip4_ctx { struct in_addr addr; @@ -115,6 +117,9 @@ struct ip4_ctx { struct in_addr dns[MAXNS + 1]; struct in_addr dns_match; struct in_addr dns_host; + + struct in_addr addr_out; + char ifname_out[IFNAMSIZ]; }; /** @@ -127,6 +132,8 @@ struct ip4_ctx { * @dns: DNS addresses for DHCPv6 and NDP, zero-terminated * @dns_match: Forward DNS query if sent to this address * @dns_host: Use this DNS on the host for forwarding + * @addr_out: Optional source address for outbound traffic + * @ifname_out: Optional interface name to bind outbound sockets to */ struct ip6_ctx { struct in6_addr addr; @@ -137,6 +144,9 @@ struct ip6_ctx { struct in6_addr dns[MAXNS + 1]; struct in6_addr dns_match; struct in6_addr dns_host; + + struct in6_addr addr_out; + char ifname_out[IFNAMSIZ]; }; #include <netinet/if_ether.h> diff --git a/tcp.c b/tcp.c index b674311..8e8d653 100644 --- a/tcp.c +++ b/tcp.c @@ -1946,6 +1946,61 @@ static uint16_t tcp_conn_tap_mss(const struct ctx *c, return MIN(mss, USHRT_MAX); } +/** + * tcp_bind_outbound() - Bind socket to outbound address and interface if given + * @c: Execution context + * @s: Outbound TCP socket + * @af: Address family + */ +static void tcp_bind_outbound(const struct ctx *c, int s, sa_family_t af) +{ + if (af == AF_INET) { + if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out)) { + struct sockaddr_in addr4 = { + .sin_family = AF_INET, + .sin_port = 0, + .sin_addr = c->ip4.addr_out, + }; + + if (bind(s, (struct sockaddr *)&addr4, sizeof(addr4))) { + debug("Can't bind IPv4 TCP socket address: %s", + strerror(errno)); + } + } + + if (*c->ip4.ifname_out) { + if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE, + c->ip4.ifname_out, + strlen(c->ip4.ifname_out))) { + debug("Can't bind IPv4 TCP socket to interface:" + " %s", strerror(errno)); + } + } + } else if (af == AF_INET6) { + if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out)) { + struct sockaddr_in6 addr6 = { + .sin6_family = AF_INET6, + .sin6_port = 0, + .sin6_addr = c->ip6.addr_out, + }; + + if (bind(s, (struct sockaddr *)&addr6, sizeof(addr6))) { + debug("Can't bind IPv6 TCP socket address: %s", + strerror(errno)); + } + } + + if (*c->ip6.ifname_out) { + if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE, + c->ip6.ifname_out, + strlen(c->ip6.ifname_out))) { + debug("Can't bind IPv6 TCP socket to interface:" + " %s", strerror(errno)); + } + } + } +} + /** * tcp_conn_from_tap() - Handle connection request (SYN segment) from tap * @c: Execution context @@ -2052,6 +2107,11 @@ static void tcp_conn_from_tap(struct ctx *c, int af, const void *addr, if (errno != EADDRNOTAVAIL && errno != EACCES) conn_flag(c, conn, LOCAL); + if ((af == AF_INET && !IN4_IS_ADDR_LOOPBACK(&addr4.sin_addr)) || + (af == AF_INET6 && !IN6_IS_ADDR_LOOPBACK(&addr6.sin6_addr) && + !IN6_IS_ADDR_LINKLOCAL(&addr6.sin6_addr))) + tcp_bind_outbound(c, s, af); + if (connect(s, sa, sl)) { if (errno != EINPROGRESS) { tcp_rst(c, conn); diff --git a/udp.c b/udp.c index b7cbfdc..ef486fe 100644 --- a/udp.c +++ b/udp.c @@ -843,20 +843,6 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr, sa = (struct sockaddr *)&s_in; sl = sizeof(s_in); - if (!(s = udp_tap_map[V4][src].sock)) { - union udp_epoll_ref uref = { .udp.port = src }; - - s = sock_l4(c, AF_INET, IPPROTO_UDP, NULL, NULL, src, - uref.u32); - if (s < 0) - return p->count; - - udp_tap_map[V4][src].sock = s; - bitmap_set(udp_act[V4][UDP_ACT_TAP], src); - } - - udp_tap_map[V4][src].ts = now->tv_sec; - if (IN4_ARE_ADDR_EQUAL(&s_in.sin_addr, &c->ip4.dns_match) && ntohs(s_in.sin_port) == 53) { s_in.sin_addr = c->ip4.dns_host; @@ -868,13 +854,37 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr, else s_in.sin_addr = c->ip4.addr_seen; } + + if (!(s = udp_tap_map[V4][src].sock)) { + union udp_epoll_ref uref = { .udp.port = src }; + in_addr_t bind_addr = { 0 }; + const char *bind_if = NULL; + + if (!IN6_IS_ADDR_LOOPBACK(&s_in.sin_addr) && + *c->ip6.ifname_out) + bind_if = c->ip6.ifname_out; + + if (!IN4_IS_ADDR_UNSPECIFIED(&c->ip4.addr_out) && + !IN4_IS_ADDR_LOOPBACK(&s_in.sin_addr)) + bind_addr = c->ip4.addr_out.s_addr; + + s = sock_l4(c, AF_INET, IPPROTO_UDP, &bind_addr, + bind_if, src, uref.u32); + if (s < 0) + return p->count; + + udp_tap_map[V4][src].sock = s; + bitmap_set(udp_act[V4][UDP_ACT_TAP], src); + } + + udp_tap_map[V4][src].ts = now->tv_sec; } else { s_in6 = (struct sockaddr_in6) { .sin6_family = AF_INET6, .sin6_port = uh->dest, .sin6_addr = *(struct in6_addr *)addr, }; - const void *bind_addr = &in6addr_any; + const struct in6_addr *bind_addr = &in6addr_any; sa = (struct sockaddr *)&s_in6; sl = sizeof(s_in6); @@ -898,9 +908,19 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr, if (!(s = udp_tap_map[V6][src].sock)) { union udp_epoll_ref uref = { .udp.v6 = 1, .udp.port = src }; + const char *bind_if = NULL; + + if (!IN6_IS_ADDR_LOOPBACK(&s_in6.sin6_addr) && + *c->ip6.ifname_out) + bind_if = c->ip6.ifname_out; + + if (!IN6_IS_ADDR_UNSPECIFIED(&c->ip6.addr_out) && + !IN6_IS_ADDR_LOOPBACK(&s_in6.sin6_addr) && + !IN6_IS_ADDR_LINKLOCAL(&s_in6.sin6_addr)) + bind_addr = &c->ip6.addr_out; - s = sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr, NULL, - src, uref.u32); + s = sock_l4(c, AF_INET6, IPPROTO_UDP, bind_addr, + bind_if, src, uref.u32); if (s < 0) return p->count; -- 2.39.2
On Wed, Mar 08, 2023 at 11:42:29PM +0100, Stefano Brivio wrote:v3: In 2/2, *really* pass the correct bind_addr reference to sock_l4() in udp_tap_handler() -- I just sent v2 from the wrong directory :/Heh. My remaining comments from v1 still apply.Stefano Brivio (2): conf, passt.h: Rename "outbound" interface to "template" interface conf, icmp, tcp, udp: Add options to bind to outbound address and interface conf.c | 89 +++++++++++++++++++++++++++++++++++++++++++++++++++------ icmp.c | 24 +++++++++++++--- passt.1 | 19 ++++++++++++ passt.h | 14 +++++++-- tcp.c | 60 ++++++++++++++++++++++++++++++++++++++ udp.c | 54 +++++++++++++++++++++++----------- 6 files changed, 228 insertions(+), 32 deletions(-)-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson