If the template host interface is of type tun, and it's configured
with a point-to-point peer address (that's what happens for example
with openvpn and '--topology net30'), pasta will copy the peer
information onto the namespace interface.

But the namespace interface is not actually a point-to-point tunnel,
and we won't resolve the peer address via ARP either, so we have to
drop this information to get the expected behaviour (traffic regularly
sent over our tap interface).

Link: https://github.com/containers/podman/issues/22320
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
 netlink.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/netlink.c b/netlink.c
index 89c0641..73aaa4b 100644
--- a/netlink.c
+++ b/netlink.c
@@ -792,8 +792,8 @@ int nl_addr_dup(int s_src, unsigned int ifi_src,
 	seq = nl_send(s_src, &req, RTM_GETADDR, NLM_F_DUMP, sizeof(req));
 
 	nl_foreach_oftype(nh, status, s_src, buf, seq, RTM_NEWADDR) {
+		struct rtattr *rta, *rta_local = NULL;
 		struct ifaddrmsg *ifa;
-		struct rtattr *rta;
 		size_t na;
 
 		ifa = (struct ifaddrmsg *)NLMSG_DATA(nh);
@@ -804,12 +804,33 @@ int nl_addr_dup(int s_src, unsigned int ifi_src,
 
 		ifa->ifa_index = ifi_dst;
 
+		for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
+		     rta = RTA_NEXT(rta, na)) {
+			if (rta->rta_type == IFA_LOCAL) {
+				rta_local = rta;
+				break;
+			}
+		}
+
 		for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
 		     rta = RTA_NEXT(rta, na)) {
 			/* Strip label and expiry (cacheinfo) information */
 			if (rta->rta_type == IFA_LABEL ||
 			    rta->rta_type == IFA_CACHEINFO)
 				rta->rta_type = IFA_UNSPEC;
+
+			/* Different values for IFA_ADDRESS and IFA_LOCAL mean
+			 * that IFA_LOCAL is the locally configured address, and
+			 * IFA_ADDRESS is the peer address for a point-to-point
+			 * interface. But our namespace interface isn't really a
+			 * point-to-point tunnel, and we can't resolve that peer
+			 * address via ARP: simply drop it, and keep the local
+			 * address.
+			 */
+			if (rta->rta_type == IFA_ADDRESS && rta_local) {
+				memcpy(RTA_DATA(rta), RTA_DATA(rta_local),
+				       RTA_PAYLOAD(rta));
+			}
 		}
 
 		rc = nl_do(s_dst, nh, RTM_NEWADDR,
-- 
2.43.0

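The two passes in the hunk above can be tried outside of pasta on a hand-built RTM_NEWADDR message. What follows is a minimal sketch, not code from the patch: put_attr4(), drop_peer() and demo_drop_peer() are hypothetical names, and the message layout (4-byte IPv4 payloads, IFA_ADDRESS placed before IFA_LOCAL) is an assumption of the demo. Looking up IFA_LOCAL in a separate first pass means the rewrite works whichever of the two attributes comes first.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_addr.h>

/* Hypothetical helper: append one 4-byte attribute to the message */
static void put_attr4(struct nlmsghdr *nh, unsigned short type, uint32_t v)
{
	struct rtattr *rta;

	rta = (struct rtattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
	rta->rta_type = type;
	rta->rta_len = RTA_LENGTH(sizeof(v));
	memcpy(RTA_DATA(rta), &v, sizeof(v));
	nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + RTA_ALIGN(rta->rta_len);
}

/* Same two passes as the patch: find IFA_LOCAL, copy it over IFA_ADDRESS */
static void drop_peer(struct nlmsghdr *nh)
{
	struct ifaddrmsg *ifa = (struct ifaddrmsg *)NLMSG_DATA(nh);
	struct rtattr *rta, *rta_local = NULL;
	int na;

	for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
	     rta = RTA_NEXT(rta, na)) {
		if (rta->rta_type == IFA_LOCAL) {
			rta_local = rta;
			break;
		}
	}

	for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
	     rta = RTA_NEXT(rta, na)) {
		if (rta->rta_type == IFA_ADDRESS && rta_local)
			memcpy(RTA_DATA(rta), RTA_DATA(rta_local),
			       RTA_PAYLOAD(rta));
	}
}

/* Hypothetical demo: build a message with a peer address, apply drop_peer(),
 * and return whatever IFA_ADDRESS holds afterwards
 */
uint32_t demo_drop_peer(uint32_t local, uint32_t peer)
{
	union { struct nlmsghdr nh; char buf[128]; } m;
	struct ifaddrmsg *ifa;
	struct rtattr *rta;
	uint32_t out = 0;
	int na;

	memset(&m, 0, sizeof(m));
	m.nh.nlmsg_type = RTM_NEWADDR;
	m.nh.nlmsg_len = NLMSG_LENGTH(sizeof(*ifa));
	ifa = (struct ifaddrmsg *)NLMSG_DATA(&m.nh);
	ifa->ifa_family = AF_INET;

	put_attr4(&m.nh, IFA_ADDRESS, peer);	/* peer end of the tunnel */
	put_attr4(&m.nh, IFA_LOCAL, local);	/* locally configured address */

	drop_peer(&m.nh);

	for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(&m.nh); RTA_OK(rta, na);
	     rta = RTA_NEXT(rta, na)) {
		if (rta->rta_type == IFA_ADDRESS)
			memcpy(&out, RTA_DATA(rta), sizeof(out));
	}

	return out;
}
```

Calling demo_drop_peer() with a local address and a different peer address should return the local address, that is, the peer no longer appears in the message that would be sent to the namespace.
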
On Fri, Apr 12, 2024 at 12:18:00AM +0200, Stefano Brivio wrote:
> If the template host interface is of type tun, and it's configured
> with a point-to-point peer address (that's what happens for example
> with openvpn and '--topology net30'), pasta will copy the peer
> information onto the namespace interface.
> 
> But the namespace interface is not actually a point-to-point tunnel,
> and we won't resolve the peer address via ARP either, so we have to
> drop this information to get the expected behaviour (traffic regularly
> sent over our tap interface).
> 
> Link: https://github.com/containers/podman/issues/22320
> Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
> ---
>  netlink.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/netlink.c b/netlink.c
> index 89c0641..73aaa4b 100644
> --- a/netlink.c
> +++ b/netlink.c
> @@ -792,8 +792,8 @@ int nl_addr_dup(int s_src, unsigned int ifi_src,
>  	seq = nl_send(s_src, &req, RTM_GETADDR, NLM_F_DUMP, sizeof(req));
>  
>  	nl_foreach_oftype(nh, status, s_src, buf, seq, RTM_NEWADDR) {
> +		struct rtattr *rta, *rta_local = NULL;
>  		struct ifaddrmsg *ifa;
> -		struct rtattr *rta;
>  		size_t na;
>  
>  		ifa = (struct ifaddrmsg *)NLMSG_DATA(nh);
> @@ -804,12 +804,33 @@ int nl_addr_dup(int s_src, unsigned int ifi_src,
>  
>  		ifa->ifa_index = ifi_dst;
>  
> +		for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
> +		     rta = RTA_NEXT(rta, na)) {
> +			if (rta->rta_type == IFA_LOCAL) {
> +				rta_local = rta;
> +				break;
> +			}
> +		}
> +
>  		for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
>  		     rta = RTA_NEXT(rta, na)) {
>  			/* Strip label and expiry (cacheinfo) information */
>  			if (rta->rta_type == IFA_LABEL ||
>  			    rta->rta_type == IFA_CACHEINFO)
>  				rta->rta_type = IFA_UNSPEC;
> +
> +			/* Different values for IFA_ADDRESS and IFA_LOCAL mean
> +			 * that IFA_LOCAL is the locally configured address, and
> +			 * IFA_ADDRESS is the peer address for a point-to-point
> +			 * interface. But our namespace interface isn't really a
> +			 * point-to-point tunnel, and we can't resolve that peer
> +			 * address via ARP: simply drop it, and keep the local
> +			 * address.

Could we just unconditionally remove IFA_ADDRESS properties (by setting
them to IFA_UNSPEC)?  That way we could avoid having two passes through
the attributes.

> +			 */
> +			if (rta->rta_type == IFA_ADDRESS && rta_local) {
> +				memcpy(RTA_DATA(rta), RTA_DATA(rta_local),
> +				       RTA_PAYLOAD(rta));
> +			}
>  		}
>  
>  		rc = nl_do(s_dst, nh, RTM_NEWADDR,

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

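One possible reading of the single-pass suggestion, as a standalone sketch rather than a tested change to nl_addr_dup(): IFA_ADDRESS is blanked in the same loop that already strips IFA_LABEL and IFA_CACHEINFO. The names put_attr4(), strip_attrs(), count_attr() and demo_strip_address() are hypothetical. The sketch also assumes that every address being duplicated carries IFA_LOCAL and that the receiving side accepts a message with IFA_LOCAL only; neither assumption is confirmed in this thread.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_addr.h>

/* Hypothetical helper: append one 4-byte attribute to the message */
static void put_attr4(struct nlmsghdr *nh, unsigned short type, uint32_t v)
{
	struct rtattr *rta;

	rta = (struct rtattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
	rta->rta_type = type;
	rta->rta_len = RTA_LENGTH(sizeof(v));
	memcpy(RTA_DATA(rta), &v, sizeof(v));
	nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + RTA_ALIGN(rta->rta_len);
}

/* Single pass: blank label, cacheinfo and, as suggested, IFA_ADDRESS */
static void strip_attrs(struct nlmsghdr *nh)
{
	struct ifaddrmsg *ifa = (struct ifaddrmsg *)NLMSG_DATA(nh);
	struct rtattr *rta;
	int na;

	for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
	     rta = RTA_NEXT(rta, na)) {
		if (rta->rta_type == IFA_LABEL ||
		    rta->rta_type == IFA_CACHEINFO ||
		    rta->rta_type == IFA_ADDRESS)
			rta->rta_type = IFA_UNSPEC;
	}
}

/* Hypothetical helper: count the attributes of a given type */
static int count_attr(struct nlmsghdr *nh, unsigned short type)
{
	struct ifaddrmsg *ifa = (struct ifaddrmsg *)NLMSG_DATA(nh);
	struct rtattr *rta;
	int na, n = 0;

	for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
	     rta = RTA_NEXT(rta, na)) {
		if (rta->rta_type == type)
			n++;
	}

	return n;
}

/* Hypothetical demo: returns 1 if IFA_ADDRESS is gone and IFA_LOCAL remains */
int demo_strip_address(void)
{
	union { struct nlmsghdr nh; char buf[128]; } m;
	struct ifaddrmsg *ifa;

	memset(&m, 0, sizeof(m));
	m.nh.nlmsg_type = RTM_NEWADDR;
	m.nh.nlmsg_len = NLMSG_LENGTH(sizeof(*ifa));
	ifa = (struct ifaddrmsg *)NLMSG_DATA(&m.nh);
	ifa->ifa_family = AF_INET;

	put_attr4(&m.nh, IFA_ADDRESS, 0x0a080005u);	/* peer address */
	put_attr4(&m.nh, IFA_LOCAL, 0x0a080006u);	/* local address */

	strip_attrs(&m.nh);

	return count_attr(&m.nh, IFA_ADDRESS) == 0 &&
	       count_attr(&m.nh, IFA_LOCAL) == 1;
}
```

Compared with the two-pass version in the patch, this variant never needs to know where IFA_LOCAL sits in the message, but an address that is dumped with IFA_ADDRESS only would lose its sole address attribute, so that case would need checking first.
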
On Tue, 23 Apr 2024 11:02:43 +1000
David Gibson <david(a)gibson.dropbear.id.au> wrote:

> On Fri, Apr 12, 2024 at 12:18:00AM +0200, Stefano Brivio wrote:
> > If the template host interface is of type tun, and it's configured
> > with a point-to-point peer address (that's what happens for example
> > with openvpn and '--topology net30'), pasta will copy the peer
> > information onto the namespace interface.
> > 
> > But the namespace interface is not actually a point-to-point tunnel,
> > and we won't resolve the peer address via ARP either, so we have to
> > drop this information to get the expected behaviour (traffic regularly
> > sent over our tap interface).
> > 
> > Link: https://github.com/containers/podman/issues/22320
> > Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
> > ---
> >  netlink.c | 23 ++++++++++++++++++++++-
> >  1 file changed, 22 insertions(+), 1 deletion(-)
> > 
> > diff --git a/netlink.c b/netlink.c
> > index 89c0641..73aaa4b 100644
> > --- a/netlink.c
> > +++ b/netlink.c
> > @@ -792,8 +792,8 @@ int nl_addr_dup(int s_src, unsigned int ifi_src,
> >  	seq = nl_send(s_src, &req, RTM_GETADDR, NLM_F_DUMP, sizeof(req));
> >  
> >  	nl_foreach_oftype(nh, status, s_src, buf, seq, RTM_NEWADDR) {
> > +		struct rtattr *rta, *rta_local = NULL;
> >  		struct ifaddrmsg *ifa;
> > -		struct rtattr *rta;
> >  		size_t na;
> >  
> >  		ifa = (struct ifaddrmsg *)NLMSG_DATA(nh);
> > @@ -804,12 +804,33 @@ int nl_addr_dup(int s_src, unsigned int ifi_src,
> >  
> >  		ifa->ifa_index = ifi_dst;
> >  
> > +		for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
> > +		     rta = RTA_NEXT(rta, na)) {
> > +			if (rta->rta_type == IFA_LOCAL) {
> > +				rta_local = rta;
> > +				break;
> > +			}
> > +		}
> > +
> >  		for (rta = IFA_RTA(ifa), na = IFA_PAYLOAD(nh); RTA_OK(rta, na);
> >  		     rta = RTA_NEXT(rta, na)) {
> >  			/* Strip label and expiry (cacheinfo) information */
> >  			if (rta->rta_type == IFA_LABEL ||
> >  			    rta->rta_type == IFA_CACHEINFO)
> >  				rta->rta_type = IFA_UNSPEC;
> > +
> > +			/* Different values for IFA_ADDRESS and IFA_LOCAL mean
> > +			 * that IFA_LOCAL is the locally configured address, and
> > +			 * IFA_ADDRESS is the peer address for a point-to-point
> > +			 * interface. But our namespace interface isn't really a
> > +			 * point-to-point tunnel, and we can't resolve that peer
> > +			 * address via ARP: simply drop it, and keep the local
> > +			 * address.
> 
> Could we just unconditionally remove IFA_ADDRESS properties (by setting
> them to IFA_UNSPEC)?  That way we could avoid having two passes through
> the attributes.

Ah, thanks, that sounds better, but I haven't tried it yet.

By the way, this patch doesn't fix the issue:

  https://github.com/containers/podman/issues/22320#issuecomment-2051279807

so I think we need something on top of this, but I'm not sure yet what.
Other than tweaking routes, another idea might be to adjust the netmask
here.

-- 
Stefano