On Sun, 28 May 2023 16:27:13 +0000
Juan Orti <jorti(a)pm.me> wrote:

> ------- Original Message -------
> On Sunday, 28 May 2023 at 16:38, Stefano Brivio <sbrivio(a)redhat.com> wrote:

Hmm, it depends:

  https://passt.top/passt/tree/udp.c?id=e3b19530e4a689f9f8e417ebf737dfca23403…

I'm not sure what the original source address of our DNS query is (you
can find that out with tcpdump in the parent namespace).

For example, if it's a loopback address, we go ahead and try to convert
both source and destination address to our notion of (observed)
link-local addresses, because we can't use a loopback address on a
non-loopback interface (non-lo in the container).

But I guess in this case it's not a loopback address: the default
gateway address, copied to the container, is fe80::ea9f:80ff:fe5d:3d6e,
which is a link-local address, but we don't use it, so I assume we end
up either in the IN6_IS_ADDR_LINKLOCAL(src) condition, or in the final
'else' clause.

At that point, the address we've seen the guest using becomes our
destination address. It can even be a link-local address if we haven't
observed a unicast address being used yet.

It would be interesting to see what happens if you generate traffic,
from the container, coming from fddc:f797:78ef:70::5, before a DNS query
is sent (a TCP request via IPv6 should be enough).

I'm not swearing on the correctness of this logic: it's the result of
handling several corner cases, it's rather ugly at the moment, and David
is currently considering how to clean that up.

By the way, this might also happen to be "fixed" on HEAD, as there we
copy all the addresses and all the routes, by default, from the parent
namespace to the container namespace.

>> I guess that might come from the IPV6_PKTINFO ancillary data
>> (cmsg_type 0x32) -- I'm not sure how and why it's used here as strace
>> doesn't dump the CMSG_DATA content, but, having a look at
>> ip6_datagram_send_ctl() (net/ipv6/datagram.c), EINVAL might come from:
>>
>> 1. a link-local address being passed along... I doubt that's the case
>>
>> 2. a non-local address (or one we can't bind to anyway) being used. To
>>    check if we're in this case, it would be helpful if you could share
>>    the addressing information from the container (ip -6 address show),
>>    and if you could try 'sysctl -w net.ipv6.ip_nonlocal_bind=1', again
>>    from the container.
>
> net.ipv6.ip_nonlocal_bind=1 is not helping.
>
> This is the container network config:
>
> # ip -6 address show
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 state UNKNOWN qlen 1000
>     inet6 ::1/128 scope host
>        valid_lft forever preferred_lft forever
> 2: enp88s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 state UNKNOWN qlen 1000
>     inet6 fddc:f797:78ef:70::5/64 scope global flags 02
>        valid_lft forever preferred_lft forever
>     inet6 fe80::5cef:4eff:fe6c:551f/64 scope link
>        valid_lft forever preferred_lft forever
>
> # ip -6 r show table all
> fddc:f797:78ef:70::/64 dev enp88s0 metric 256
> fe80::/64 dev enp88s0 metric 256
> default via fe80::ea9f:80ff:fe5d:3d6e dev enp88s0 metric 1024
> local ::1 dev lo table local metric 0
> local fddc:f797:78ef:70::5 dev enp88s0 table local metric 0
> local fe80::5cef:4eff:fe6c:551f dev enp88s0 table local metric 0
> multicast ff00::/8 dev enp88s0 table local metric 256
>
> With a tcpdump inside the container I can see that the incoming packets
> are actually arriving with the link-local address as the destination
> (is this expected?).
>
> 16:18:26.248659 IP6 (hlim 255, next-header UDP (17) payload length: 63) fddc:f797:78ef:10::b46.42091 > fe80::5cef:4eff:fe6c:551f.53: [udp sum ok] 6215+ [1au] A? www.google.com. (55)
> 16:18:31.253942 IP6 (hlim 255, next-header UDP (17) payload length: 63) fddc:f797:78ef:10::b46.34965 > fe80::5cef:4eff:fe6c:551f.53: [udp sum ok] 6215+ [1au] A? www.google.com. (55)
> 16:18:36.257294 IP6 (hlim 255, next-header UDP (17) payload length: 63) fddc:f797:78ef:10::b46.55302 > fe80::5cef:4eff:fe6c:551f.53: [udp sum ok] 6215+ [1au] A? www.google.com. (55)
>
> TCP also uses the link-local address, however it works:

...yes, as far as I know there are no normative references preventing a
non-link-local address from contacting a link-local one.

This just happens to be a problem because AdguardHome uses IPV6_PKTINFO,
with that same address I guess, in its sendmsg(), and, for some reason I
didn't really investigate, that leads to EINVAL on Linux, but it looks
like an implementation detail (specific to UDP) to me.

-- 
Stefano
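
The IPV6_PKTINFO guess above could be checked without AdguardHome by sending a single UDP datagram with the same kind of ancillary data from inside the container. What follows is only a minimal sketch under stated assumptions: the helper names (pack_in6_pktinfo, is_link_local, send_with_pktinfo) are made up for illustration, the constant 0x32 is the cmsg_type value quoted from strace (the Linux value), and the struct layout assumed is a 16-byte address followed by a 4-byte interface index. It does not claim to reproduce what AdguardHome actually passes to sendmsg().

```python
import ipaddress
import socket
import struct

# cmsg_type reported by strace in this thread; assumed to be the Linux
# value of IPV6_PKTINFO (hard-coded here rather than taken from the
# socket module, so the sketch does not depend on how Python was built).
IPV6_PKTINFO = 0x32


def pack_in6_pktinfo(src: str, ifindex: int = 0) -> bytes:
    """Pack an in6_pktinfo-like buffer: 16-byte address, then ifindex."""
    return ipaddress.IPv6Address(src).packed + struct.pack("@I", ifindex)


def is_link_local(addr: str) -> bool:
    """True if addr falls in fe80::/10."""
    return ipaddress.IPv6Address(addr).is_link_local


def send_with_pktinfo(payload: bytes, dest: tuple, src: str, ifindex: int = 0):
    """Send one UDP datagram asking the kernel to use src as source address.

    Returns None on success, or the errno value if sendmsg() fails.
    Hypothetical helper for experimenting, not part of any library.
    """
    ancdata = [(socket.IPPROTO_IPV6, IPV6_PKTINFO,
                pack_in6_pktinfo(src, ifindex))]
    with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as sock:
        try:
            sock.sendmsg([payload], ancdata, 0, dest)
        except OSError as exc:
            return exc.errno
    return None
```

One possible way to use it from the container: call send_with_pktinfo(b"x", ("fddc:f797:78ef:10::b46", 42091), "fe80::5cef:4eff:fe6c:551f") once with ifindex left at 0 and once with ifindex=socket.if_nametoindex("enp88s0"), then compare the returned values against errno.EINVAL. The same call with "fddc:f797:78ef:70::5" as src gives a baseline. Which of these fails, if any, is exactly what the experiment would have to show.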