On Fri, Sep 09, 2022 at 06:06:59PM +0200, Stefano Brivio wrote:
On Fri, 9 Sep 2022 20:39:44 +1000 David Gibson wrote:
On Fri, Sep 09, 2022 at 11:26:58AM +0200, Stefano Brivio wrote:
On Fri, 9 Sep 2022 14:27:13 +1000 David Gibson wrote:
udp_tap_handler() currently skips outbound packets if they have a payload length of zero. This is not correct, since in a datagram protocol zero length packets still have meaning.
Right, nice catch. As far as I can tell it's an issue I added with commit bb708111833e ("treewide: Packet abstraction with mandatory boundary checks").
Adjust this to correctly forward the zero-length packets by using a msghdr with msg_iovlen == 0.
Bugzilla: https://bugs.passt.top/show_bug.cgi?id=19
Signed-off-by: David Gibson
---
 udp.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/udp.c b/udp.c
index c4ebecc..caa852a 100644
--- a/udp.c
+++ b/udp.c
@@ -1075,19 +1075,19 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
 		uh_send = packet_get(p, i, 0, sizeof(*uh), &len);
 		if (!uh_send)
 			return p->count;
+
+		mm[i].msg_hdr.msg_name = sa;
+		mm[i].msg_hdr.msg_namelen = sl;
+		count++;
+
 		if (!len)
 			continue;

 		m[i].iov_base = (char *)(uh_send + 1);
 		m[i].iov_len = len;
I haven't tested this yet, but:
- shouldn't iov_len be set to 0 (also moving this line earlier)? Note that I'm not initialising m.
- shouldn't iov_base point to NULL to avoid noise from valgrind?
No, because with this change m[i] is entirely unreferenced by mm[].
Also:
-		mm[i].msg_hdr.msg_name = sa;
-		mm[i].msg_hdr.msg_namelen = sl;
-
 		mm[i].msg_hdr.msg_iov = m + i;
 		mm[i].msg_hdr.msg_iovlen = 1;
...I guess we should still go through those even if the size is zero, because we're appending a message. If we don't, I would expect some subsequent messages in the batch to be dropped (as many as zero sized packets we have).
Here I'm relying on the fact that mm[] (unlike m[]) *is* initialized, so if we don't alter it here, msg_iov is NULL and msg_iovlen is 0.
I was looking at removing that initialization, but I haven't gotten that working yet.
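As a rough illustration of the two points above (an empty packet still needs its own entry in the batch, and a zeroed mmsghdr entry already describes an empty datagram), here is a minimal standalone sketch. It is not passt code: send_batch_with_empty(), the loopback sockets and the payloads are made up for the demonstration, and only the way mm[] and m[] are filled mirrors the patch.

```c
#define _GNU_SOURCE
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define BATCH 3

/* Hypothetical demo helper, not part of passt: send BATCH datagrams over
 * loopback with sendmmsg(), the middle one empty, and record the length of
 * each datagram received. Returns the sendmmsg() result, -1 on setup errors.
 */
int send_batch_with_empty(ssize_t lens[BATCH])
{
	static char payload[BATCH][4] = { "one", "", "two" };
	const size_t sizes[BATCH] = { 3, 0, 3 };
	struct sockaddr_in sa = { .sin_family = AF_INET };
	struct timeval tv = { .tv_sec = 1 };	/* don't block forever */
	struct mmsghdr mm[BATCH];
	struct iovec m[BATCH];		/* not initialised, as in the thread */
	socklen_t sl = sizeof(sa);
	int rx, tx, i, sent = -1;
	char buf[16];

	rx = socket(AF_INET, SOCK_DGRAM, 0);
	tx = socket(AF_INET, SOCK_DGRAM, 0);
	if (rx < 0 || tx < 0)
		goto out;

	/* Bind the receiver to loopback, port 0: the kernel picks a port */
	sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(rx, (struct sockaddr *)&sa, sl) ||
	    getsockname(rx, (struct sockaddr *)&sa, &sl) ||
	    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)))
		goto out;

	memset(mm, 0, sizeof(mm));	/* msg_iov == NULL, msg_iovlen == 0 */

	for (i = 0; i < BATCH; i++) {
		mm[i].msg_hdr.msg_name = &sa;
		mm[i].msg_hdr.msg_namelen = sl;

		/* Zero-length packet: don't use any buffer, msg_iovlen is 0 */
		if (!sizes[i])
			continue;

		m[i].iov_base = payload[i];
		m[i].iov_len = sizes[i];
		mm[i].msg_hdr.msg_iov = m + i;
		mm[i].msg_hdr.msg_iovlen = 1;
	}

	sent = sendmmsg(tx, mm, BATCH, 0);
	for (i = 0; i < sent; i++)
		lens[i] = recv(rx, buf, sizeof(buf), 0);
out:
	if (rx >= 0)
		close(rx);
	if (tx >= 0)
		close(tx);
	return sent;
}
```

Calling send_batch_with_empty() with an array of three ssize_t should report three messages sent, with the middle one received as a zero-length datagram. m[1] is never referenced by mm[1], which is why leaving it untouched is harmless here.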
Oops, I see now.
So, I suppose that if you want to drop that initialisation, you might need to zero msg_hdr.msg_controllen as well.
Duh. I completely failed to consider the other fields. I actually suspect msg_hdr.msg_flags is the most vital one (without flags I don't know if it will examine msg_control or msg_controllen). But in any case I'm initializing them all now and it's working.
And msg_hdr.msg_control too: other than keeping valgrind happy, not leaking random stuff to the kernel might make this marginally more secure.
That should be better than the huge memset() at the beginning, because we're already writing to msg_iovlen anyway.
If you already tried that, though, I don't have any other quick idea.
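For reference, a minimal sketch of what setting every field per message could look like, in place of one large memset() up front. This is not the passt implementation: msg_fill() and send_from_dirty_header() are hypothetical names, a plain struct msghdr with sendmsg() stands in for the mmsghdr batch, and the 0xff fill only simulates stale memory. Note that send(2) documents msg_flags as ignored on the send path, so zeroing it here is about not passing stale data rather than about correctness.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not from passt: set every field of one header, so
 * that nothing depends on the header having been zeroed beforehand.
 */
void msg_fill(struct msghdr *mh, struct iovec *iov,
	      struct sockaddr_in *sa, const void *data, size_t len)
{
	mh->msg_name = sa;
	mh->msg_namelen = sizeof(*sa);
	mh->msg_control = NULL;
	mh->msg_controllen = 0;
	mh->msg_flags = 0;
	mh->msg_iov = NULL;
	mh->msg_iovlen = 0;

	if (!len)	/* zero-length packet: no buffer at all */
		return;

	iov->iov_base = (void *)data;
	iov->iov_len = len;
	mh->msg_iov = iov;
	mh->msg_iovlen = 1;
}

/* Demo only: fill a header that starts out as garbage, send one datagram
 * over loopback, and return the length received (-1 on any error).
 */
ssize_t send_from_dirty_header(const void *data, size_t len)
{
	struct sockaddr_in sa = { .sin_family = AF_INET };
	struct timeval tv = { .tv_sec = 1 };	/* don't block forever */
	socklen_t sl = sizeof(sa);
	struct msghdr mh;
	struct iovec iov;
	ssize_t n = -1;
	char buf[64];
	int rx, tx;

	rx = socket(AF_INET, SOCK_DGRAM, 0);
	tx = socket(AF_INET, SOCK_DGRAM, 0);
	if (rx < 0 || tx < 0)
		goto out;

	/* Bind the receiver to loopback, port 0: the kernel picks a port */
	sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(rx, (struct sockaddr *)&sa, sl) ||
	    getsockname(rx, (struct sockaddr *)&sa, &sl) ||
	    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)))
		goto out;

	memset(&mh, 0xff, sizeof(mh));		/* simulate stale contents */
	memset(&iov, 0xff, sizeof(iov));
	msg_fill(&mh, &iov, &sa, data, len);

	if (sendmsg(tx, &mh, 0) == (ssize_t)len)
		n = recv(rx, buf, sizeof(buf), 0);
out:
	if (rx >= 0)
		close(rx);
	if (tx >= 0)
		close(tx);
	return n;
}
```

With this, send_from_dirty_header("abc", 3) and send_from_dirty_header(NULL, 0) should both succeed even though the header started out as garbage, since each field the kernel reads is written explicitly.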
By the way, I had a mechanism in place, just for TCP though, to avoid reassigning those pointers and also length descriptors.
I got rid of it in commit 38fbfdbcb95d ("tcp: Get rid of iov with cached MSS, drop sendmmsg(), add deferred flush") because it didn't really help with throughput. I don't see any significant "userspace" overhead on guest-to-host TCP paths with perf(1).
...maybe for UDP that's different, I haven't focused that much on UDP performance.
That is, I suppose we could just drop the continue statement on if (!len) above -- but, again, I haven't tested it.
My first version actually did that, so it also works, but I think setting msg_iovlen to 0 is a bit neater.
Right. Maybe it was just me being thick, or perhaps that could use a comment:
		/* Zero-length packet: don't use any buffer, msg_iovlen is 0 */
		if (!len)
			continue;
-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson