On Tue, 27 Aug 2024 11:12:46 +1000 David Gibson <david(a)gibson.dropbear.id.au> wrote:On Mon, Aug 26, 2024 at 09:32:55PM +0200, Stefano Brivio wrote:I still don't see how: the second assignment (out of three) is done before i is incremented, so that should cover i == 0 as well, right?On Mon, 26 Aug 2024 19:37:14 +1000 David Gibson <david(a)gibson.dropbear.id.au> wrote:It's not redundant per se, because the later assignment only occurs for i > 0, so the first one is for slot 0.We've already gotten rid of most of the IPv4/IPv6 specific data structures in udp.c by merging them with each other. One significant one remains: udp[46]_mh_recv. This was a bit awkward to remove because of a subtle interaction. We initialise the msg_namelen fields to represent the total size we have for a socket address, but when we receive into the arrays those are modified to the actual length of the sockaddr we received. That meant that naively merging the arrays meant that if we received IPv4 datagrams, then IPv6 datagrams, the addresses for the latter would be truncated. In this patch address that by resetting the received msg_namelen as soon as we've found a flow for the datagram. Finding the flow is the only thing that might use the actual sockaddr length, although we in fact don't need it for the time being. This also removes the last use of the 'v6' field from udp_listen_epoll_ref, so remove that as well. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- udp.c | 57 ++++++++++++++++++++------------------------------------- udp.h | 2 -- 2 files changed, 20 insertions(+), 39 deletions(-) diff --git a/udp.c b/udp.c index 8a93aad6..6638c22b 100644 --- a/udp.c +++ b/udp.c @@ -178,8 +178,7 @@ enum udp_iov_idx { /* IOVs and msghdr arrays for receiving datagrams from sockets */ static struct iovec udp_iov_recv [UDP_MAX_FRAMES]; -static struct mmsghdr udp4_mh_recv [UDP_MAX_FRAMES]; -static struct mmsghdr udp6_mh_recv [UDP_MAX_FRAMES]; +static struct mmsghdr udp_mh_recv [UDP_MAX_FRAMES]; /* IOVs and msghdr arrays for sending "spliced" datagrams to sockets */ static union sockaddr_inany udp_splice_to; @@ -222,6 +221,7 @@ void udp_update_l2_buf(const unsigned char *eth_d, const unsigned char *eth_s) static void udp_iov_init_one(const struct ctx *c, size_t i) { struct udp_payload_t *payload = &udp_payload[i]; + struct msghdr *mh = &udp_mh_recv[i].msg_hdr; struct udp_meta_t *meta = &udp_meta[i]; struct iovec *siov = &udp_iov_recv[i]; struct iovec *tiov = udp_l2_iov[i]; @@ -236,27 +236,10 @@ static void udp_iov_init_one(const struct ctx *c, size_t i) tiov[UDP_IOV_TAP] = tap_hdr_iov(c, &meta->taph); tiov[UDP_IOV_PAYLOAD].iov_base = payload; - /* It's useful to have separate msghdr arrays for receiving. Otherwise, - * an IPv4 recv() will alter msg_namelen, so we'd have to reset it every - * time or risk truncating the address on future IPv6 recv()s. - */ - if (c->ifi4) { - struct msghdr *mh = &udp4_mh_recv[i].msg_hdr; - - mh->msg_name = &meta->s_in; - mh->msg_namelen = sizeof(struct sockaddr_in); - mh->msg_iov = siov; - mh->msg_iovlen = 1; - } - - if (c->ifi6) { - struct msghdr *mh = &udp6_mh_recv[i].msg_hdr; - - mh->msg_name = &meta->s_in; - mh->msg_namelen = sizeof(struct sockaddr_in6); - mh->msg_iov = siov; - mh->msg_iovlen = 1; - } + mh->msg_name = &meta->s_in; + mh->msg_namelen = sizeof(meta->s_in); + mh->msg_iov = siov; + mh->msg_iovlen = 1; } /** @@ -506,10 +489,10 @@ static int udp_sock_recv(const struct ctx *c, int s, uint32_t events, void udp_listen_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events, const struct timespec *now) { - struct mmsghdr *mmh_recv = ref.udp.v6 ? udp6_mh_recv : udp4_mh_recv; + const socklen_t sasize = sizeof(udp_meta[0].s_in); int n, i; - if ((n = udp_sock_recv(c, ref.fd, events, mmh_recv)) <= 0) + if ((n = udp_sock_recv(c, ref.fd, events, udp_mh_recv)) <= 0) return; /* We divide datagrams into batches based on how we need to send them, @@ -518,6 +501,7 @@ void udp_listen_sock_handler(const struct ctx *c, union epoll_ref ref, * populate it one entry *ahead* of the loop counter. */ udp_meta[0].tosidx = udp_flow_from_sock(c, ref, &udp_meta[0].s_in, now); + udp_mh_recv[0].msg_hdr.msg_namelen = sasize;I don't understand why you need this assignment. To me it looks redundant with: udp_mh_recv[i].msg_hdr.msg_namelen = sizeof(udp_meta[i].s_in);It would, however, be possible to move to a single assignment in the loop body before i is incremented. I did it this way, because I found it easier to reason about. At least theoretically the value of msg_namelen written by recvmmsg() could be important, although we don't use yet (we rely on the sa_family field instead). But because of that it felt wrong to overwrite that value before we've "consumed" it. Logically that happens in udp_flow_from_sock() which is what takes the address in msg_name / msg_namelen and converts it into the long-term form (as part of the flowside). Hence, clearing msg_namelen immediately after each call to udp_flow_from_sock() made sense to me. I did consider changing udp_flow_from_sock() to take a socklen_t * which it clears after using. That seemed slightly abstraction violationy to me: clearing msg_namelen only makes sense because the address is part of a re-used mmsghdr array, and that's not something udp_flow_from_sock() "knows". That was my reasoning, anyway. I'm happy enough to change it if you have a preferred approach.No, no, this all makes sense. But you add three assignments here, and I don't understand why #1 is needed if we have #2 and #3, or why #2 is needed if we have #1 and #3.Right, but why do you use it just twice out of three assignments? What is special with the one immediately above here?later (because n > 0), and:It is. The only purpose of sasize is to avoid some over-long lines.for (i = 0; i < n; ) { flow_sidx_t batchsidx = udp_meta[i].tosidx; uint8_t batchpif = pif_at_sidx(batchsidx); @@ -525,18 +509,22 @@ void udp_listen_sock_handler(const struct ctx *c, union epoll_ref ref, do { if (pif_is_socket(batchpif)) { - udp_splice_prepare(mmh_recv, i); + udp_splice_prepare(udp_mh_recv, i); } else if (batchpif == PIF_TAP) { - udp_tap_prepare(mmh_recv, i, + udp_tap_prepare(udp_mh_recv, i, flowside_at_sidx(batchsidx)); } + /* Restore sockaddr length clobbered by recvmsg() */ + udp_mh_recv[i].msg_hdr.msg_namelen = sizeof(udp_meta[i].s_in);what is the difference between assigning sizeof(udp_meta[i].s_in); and sasize? I thought it would be the same quantity.> > + > > if (++i >= n) > > break; > > > > udp_meta[i].tosidx = udp_flow_from_sock(c, ref, > > &udp_meta[i].s_in, > > now); > > + udp_mh_recv[i].msg_hdr.msg_namelen = sasize; > > } while (flow_sidx_eq(udp_meta[i].tosidx, batchsidx)); > > > > if (pif_is_socket(batchpif)) {-- Stefano