On Wed, Dec 14, 2022 at 11:35:54AM +0100, Stefano Brivio wrote:On Wed, 14 Dec 2022 12:47:25 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:I think the int is less ugly than the pointer. Ok, I'll make that change.On Tue, Dec 13, 2022 at 11:49:18PM +0100, Stefano Brivio wrote:Acceptable, sure, but... I don't know, it somehow doesn't look desirable to me. The kernel doesn't enforce this, so I guess we shouldn't either.On Mon, 5 Dec 2022 19:14:24 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Hm, ok. Given the IANA reservation, I think it would be acceptable to simply drop such packets - but if we were to make that choice we should do so explicitly, rather than misdirecting them.Currently we have special sockets for receiving datagrams from locahost which can use the optimized "splice" path rather than going across the tap interface. We want to loosen this so that sockets can receive sockets that will be forwarded by both the spliced and non-spliced paths. To do this, we alter the meaning of the @splice bit in the reference to mean that packets receieved on this socket *can* be spliced, not that they *will* be spliced. They'll only actually be spliced if they come from 127.0.0.1 or ::1. We can't (for now) remove the splice bit entirely, unlike with TCP. Our gateway mapping means that if the ns initiates communication to the gw address, we'll translate that to target 127.0.0.1 on the host side. Reply packets will therefore have source address 127.0.0.1 when received on the host, but these need to go via the tap path where that will be translated back to the gateway address. We need the @splice bit to distinguish that case from packets going from localhost to a port mapped explicitly with -u which should be spliced. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- udp.c | 54 +++++++++++++++++++++++++++++++++++------------------- udp.h | 2 +- 2 files changed, 36 insertions(+), 20 deletions(-) diff --git a/udp.c b/udp.c index 6ccfe8c..011a157 100644 --- a/udp.c +++ b/udp.c @@ -513,16 +513,27 @@ static int udp_splice_new_ns(void *arg) } /** - * sa_port() - Determine port from a sockaddr_in or sockaddr_in6 + * udp_mmh_splice_port() - Is source address of message suitable for splicing? * @v6: Is @sa a sockaddr_in6 (otherwise sockaddr_in)? - * @sa: Pointer to either sockaddr_in or sockaddr_in6 + * @mmh: mmsghdr of incoming message + * + * Return: if @sa refers to localhost (127.0.0.1 or ::1) the port from + * @sa, otherwise 0. + * + * NOTE: this relies on the fact that it's not valid to use UDP port 0The port is reserved by IANA indeed, but... it can actually be used. On Linux, you can bind() it and you can connect() to it. As far as I can tell from the new version of udp_sock_handler() we would actually misdirect packets in that case.Eh, I don't like it either, but... I guess it's better than the alternative, so yes, thanks. Or pass port as a pointer, set on return. I'm fine with both.How bad would it be to use an int here?Pretty straightforward. Just means we have to use the somewhat abtruse "if (port <= USHRT_MAX)" or "if (port >= 0)" or something instead of just "if (port)". Should I go ahead and make that change?Right. Actually rereading the function in question, it specifically says "port from @sa" and in the sockaddr, of course, it's network endian, so it could particularly do with clarification in this case.Yes, I see, and it's a more valid approach than mine, but still mine comes almost for free.By the way, I think the comment should also mention that the port is returned in host order.Ok, easily done. Generally I try to keep the endianness associated with the type, rather than attempting to document it for each variable (or even worse, each point in time for each variable).By the way, I got distracted by this and I forgot about two things:Fixed.+static in_port_t udp_mmh_splice_port(bool v6, const struct mmsghdr *mmh) { - const struct sockaddr_in6 *sa6 = sa; - const struct sockaddr_in *sa4 = sa; + const struct sockaddr_in6 *sa6 = mmh->msg_hdr.msg_name; + const struct sockaddr_in *sa4 = mmh->msg_hdr.msg_name;;Stray semicolon here.Oops, yes, that's definitely wrong, and wrong enough to respin.+ + if (v6 && IN6_IS_ADDR_LOOPBACK(&sa6->sin6_addr)) + return ntohs(sa6->sin6_port); - return v6 ? ntohs(sa6->sin6_port) : ntohs(sa4->sin_port); + if (ntohl(sa4->sin_addr.s_addr) == INADDR_LOOPBACK) + return ntohs(sa4->sin_port);If it's IPv6, but not a loopback address, we'll check if sa4->sin_addr.s_addr == INADDR_LOOPBACK -- which might actually be true for an IPv6, non-loopback address.Also, I think we can happily "splice" for any loopback address, not just 127.0.0.1. What about something like:Well.. yes and no. Yes, we can deliver packets in this way, but we'll lost track of the original from address, so reply packets will be misdirected. Hrm.. ISTR you mentioned some cases that worked despite that, so I'll make the change, and hope to fix it up better when I get to the NAT / splice semantics rework.if (v6 && IN6_IS_ADDR_LOOPBACK(&sa6->sin6_addr)) return ntohs(sa6->sin6_port); if (!v4 && IN4_IS_ADDR_LOOPBACK(&sa4->sin_addr)) return ntohs(sa4->sin_port); return -1; ?-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson