On Tue, Jul 19, 2022 at 09:05:52PM +0200, Stefano Brivio wrote:
> On Tue, 19 Jul 2022 16:20:45 +1000
> David Gibson <david(a)gibson.dropbear.id.au> wrote:
>
> > On Fri, Jul 15, 2022 at 03:21:40PM +1000, David Gibson wrote:
> > > By default, passt itself attaches to the first host interface with
> > > a default route. However, when determining the host interface name
> > > the tests implicitly select the *last* host interface: they use a
> > > jq expression which will list all interfaces with default routes,
> > > but the way output detection works in the scripts, it will only
> > > pick up the last line.
> > >
> > > If there are multiple interfaces with default routes on the host,
> > > and they each have a different address, this can cause spurious
> > > test failures.
> >
> > It seems this change is not enough to always fix the tests when
> > there are multiple default routes. I'm still sometimes getting
> > failures, now because passt itself doesn't seem to be picking the
> > interface with the first default route. I'm wondering if this is
> > because ip(8) is sorting the output, not just presenting it in the
> > same order that the underlying netlink interface does.
>
> I don't see that happening at least in my environment (and I also
> can't see any code that would sort it, that pretty much comes from
> rtnl_dump_filter_l() in lib/libnetlink.c).
>
> The netlink "filter", though, is slightly different. For IPv4:
>
> $ strace -e sendto ip route show >/dev/null
> sendto(3, [{{len=36, type=RTM_GETROUTE, flags=NLM_F_REQUEST|NLM_F_DUMP, seq=1658257027, pid=0}, {rtm_family=AF_INET, rtm_dst_len=0, rtm_src_len=0, rtm_tos=0, rtm_table=RT_TABLE_UNSPEC, rtm_protocol=RTPROT_UNSPEC, rtm_scope=RT_SCOPE_UNIVERSE, rtm_type=RTN_UNSPEC, rtm_flags=0}, {{nla_len=8, nla_type=RTA_TABLE}, RT_TABLE_MAIN}}, {len=0, type=0 /* NLMSG_??? */, flags=0, seq=0, pid=0}], 156, 0, NULL, 0) = 156
>
> $ strace -e sendto ./passt -f
> sendto(5, {{len=28, type=RTM_GETROUTE, flags=NLM_F_REQUEST|NLM_F_DUMP, seq=0, pid=0}, {rtm_family=AF_INET, rtm_dst_len=0, rtm_src_len=0, rtm_tos=0, rtm_table=RT_TABLE_MAIN, rtm_protocol=RTPROT_UNSPEC, rtm_scope=RT_SCOPE_UNIVERSE, rtm_type=RTN_UNICAST, rtm_flags=0}}, 28, 0, NULL, 0) = 28
> [...]
>
> and, while I don't think the FIB trie is actually descended in a
> different way, we might still have a slightly different result from
> the kernel (if I recall correctly, I didn't check right now).
>
> But letting that aside for a moment: if you have two default routes,
> I suppose they have different metrics. If not, what's the intended
> usage?

Yes, there are two different metrics. I was thinking after I sent
this that we should sort by metric.

> If yes, we should probably implement a sorting logic in passt, so
> that the route with the lowest metric is picked, and then adjust the
> jq expression to also pick that one.

Agreed. I'll see if I can implement this. Up to you whether you drop
this patch or I just do another update on top of it.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
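
For illustration only, the "pick the default route with the lowest metric" rule agreed on above could be sketched as follows. This is neither passt's code nor the jq expression used by the tests. It assumes route entries shaped like the JSON printed by `ip -j route show` (keys `dst`, `dev`, `metric`), it assumes that a route with no `metric` key has metric 0, and the interface names and addresses in the sample are hypothetical.

```python
import json


def pick_default_ifname(routes):
    """Return the interface of the default route with the lowest metric.

    Returns None if no default route is present.
    """
    # Keep only default routes that name an output interface.
    defaults = [r for r in routes if r.get("dst") == "default" and "dev" in r]
    if not defaults:
        return None
    # Assumption: a missing "metric" key means metric 0.
    # min() returns the first minimal element, so routes with equal
    # metrics keep the order in which they were dumped.
    best = min(defaults, key=lambda r: r.get("metric", 0))
    return best["dev"]


# Hypothetical host with two default routes and one on-link route.
SAMPLE = json.loads("""
[
  {"dst": "default", "gateway": "192.0.2.1", "dev": "eth0", "metric": 200},
  {"dst": "default", "gateway": "198.51.100.1", "dev": "wlan0", "metric": 100},
  {"dst": "192.0.2.0/24", "dev": "eth0", "metric": 200}
]
""")

if __name__ == "__main__":
    print(pick_default_ifname(SAMPLE))
```

Using the same rule on both sides (passt and the test scripts) is what would make the interface choice agree when a host has several default routes, regardless of the order in which the routes are dumped.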