On Thu, 28 Dec 2023 13:42:25 +1100
David Gibson <david(a)gibson.dropbear.id.au> wrote:
On Wed, Dec 27, 2023 at 09:25:06PM +0100, Stefano
Brivio wrote:
On Fri, 8 Dec 2023 01:31:33 +1100
David Gibson <david(a)gibson.dropbear.id.au> wrote:
> This takes a struct in_addr * (i.e. an IPv4 address), although it's
> explicitly supposed to handle IPv6 as well. Both its caller and sock_l4()
> which it calls use a void * for the address, which can be either an in_addr
> or an in6_addr.
>
> We get away with this, because we don't do anything with the pointer other
> than transfer it from the caller to sock_l4(), but it's misleading. And
> quite possibly technically UB, because C is like that.
>
> Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au>
> ---
> tcp.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tcp.c b/tcp.c
> index f506cfd..bda95b2 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -2905,7 +2905,7 @@ void tcp_sock_handler(struct ctx *c, union epoll_ref ref,
uint32_t events)
> * Return: fd for the new listening socket, negative error code on failure
> */
> static int tcp_sock_init_af(const struct ctx *c, int af, in_port_t port,
> - const struct in_addr *addr, const char *ifname)
> + const void *addr, const char *ifname)
This is obviously correct.
However, after a lot of thinking: (gcc) optimisations based on
Type-Based Alias Analysis, which we don't disable on this path, could,
now, happily defer filling 'addr' with inet_pton() in conf_ports() to a
point *after* the tcp_sock_init() call.
Hrm... possibly. The fact that the addr variable in conf_ports() is a
char array, not a struct in*_addr might save us.
Hmm, look at the commit message for a48c5c2abf8a ("treewide: Disable
gcc strict aliasing rules as needed, drop workarounds"): that didn't help
with the checksum functions, because yes, at some point I had char *, but
then I used those as different types.
I guess struct in_addr / struct in6_addr as we have in sock_l4() might be
equivalent to that.
I think replacing it
with a union of an in_addr and in6_addr would also be ok.
That should work, yes, and that's what I originally wanted to suggest,
before remembering about union inany_addr... but that doesn't fit, see
below.
Without
this patch, at least 32 bits must be updated before the call.
I'm not sure that's correct. If the compiler is allowed to assume
that a char[] and a void * aren't aliased (even if they clearly are),
then I'd expect it to also be allowed to assume that a char[] and a
struct in_addr * aren't aliased.
Ouch, right, they aren't (again... sarcastically speaking).
It might
sound like a joke because... it actually is. But look at what
we had to do for the functions in checksum.c. We pass const void *buf,
and anything that buf points to can be updated (with TBAA) after the
call.
I don't see any conceptual difference between this case and those
functions.
Anyway, that won't reasonably happen here, and in any case this would
have been broken for IPv6, so I'll go ahead and apply this.
But, eventually, I think we should switch all these usages to union
inany_addr *.
So, we may be able to use union inany_addr in some places, but that's
not the same thing as this: inany_addr carries IPv4 addresses as
mapped IPv6 addresses, it's not switched on a separate af parameter.
I really meant *a pointer* to union inany_addr, that is:
We could, of course, define a new type as a
simple union of in_addr
and in6_addr.
...abusing it instead of using a separate union. On the other hand,
given where 'a4' is in there, it's not necessarily the same for
(strict) aliasing considerations.
Is "union in10_addr" fashionable enough? We could use A [16], but it's
inconvenient to type, and difficult to pronounce.
Uh, I haven't heard of either of those.