[PATCH 0/2] Don't drop outbound zero-length UDP packets over tap
passt/pasta was incorrectly dropping UDP packets with a zero-length
payload when travelling out via the tap interface. This is incorrect,
since for a datagram protocol, zero-length packets are still meaningful.

Based on my earlier series for test command dispatch, user namespace
cleanup and test temporary file handling.

Bugzilla: https://bugs.passt.top/show_bug.cgi?id=19

David Gibson (2):
  udp: Don't drop zero-length outbound UDP packets
  test: Simpler termination handling for UDP tests

 test/passt/udp       | 23 +++++++-------
 test/passt_in_ns/udp | 73 ++++++++++++++++++++++----------------------
 test/pasta/udp       | 31 +++++++++----------
 udp.c                | 10 +++---
 4 files changed, 67 insertions(+), 70 deletions(-)

-- 
2.37.3
udp_tap_handler() currently skips outbound packets if they have a payload
length of zero. This is not correct, since in a datagram protocol zero
length packets still have meaning.
Adjust this to correctly forward the zero-length packets by using a msghdr
with msg_iovlen == 0.
Bugzilla: https://bugs.passt.top/show_bug.cgi?id=19
Signed-off-by: David Gibson
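As an illustration of the approach described in the commit message, here is a minimal sketch of sending a datagram through a msghdr with msg_iovlen == 0. It is not taken from udp.c: the function names are hypothetical, and a connected AF_UNIX datagram pair is assumed as a stand-in for the UDP socket so that the sketch runs without any network setup.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send one datagram with no payload: no iovec is attached at all. */
ssize_t send_empty_datagram(int fd)
{
	struct msghdr mh;

	memset(&mh, 0, sizeof(mh));	/* msg_iov == NULL, msg_iovlen == 0 */
	return sendmsg(fd, &mh, 0);
}

/* Hypothetical demo: returns what the receiver's recv() returned,
 * or -1 if setting up or sending failed.
 */
ssize_t empty_datagram_roundtrip(void)
{
	char buf[16];
	ssize_t n = -1;
	int sv[2];

	/* Assumption: AF_UNIX datagram pair standing in for a UDP socket */
	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv))
		return -1;

	/* MSG_DONTWAIT: fail instead of blocking if nothing was queued */
	if (send_empty_datagram(sv[0]) >= 0)
		n = recv(sv[1], buf, sizeof(buf), MSG_DONTWAIT);

	close(sv[0]);
	close(sv[1]);
	return n;
}
```

If the empty datagram is delivered, the receiver's recv() returns 0 rather than failing, which is what distinguishes a forwarded zero-length packet from a dropped one.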
On Fri, 9 Sep 2022 14:27:13 +1000
David Gibson wrote:
udp_tap_handler() currently skips outbound packets if they have a payload length of zero. This is not correct, since in a datagram protocol zero length packets still have meaning.
Right, nice catch. As far as I can tell it's an issue I added with commit bb708111833e ("treewide: Packet abstraction with mandatory boundary checks").
Adjust this to correctly forward the zero-length packets by using a msghdr with msg_iovlen == 0.
Bugzilla: https://bugs.passt.top/show_bug.cgi?id=19
Signed-off-by: David Gibson
---
 udp.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/udp.c b/udp.c
index c4ebecc..caa852a 100644
--- a/udp.c
+++ b/udp.c
@@ -1075,19 +1075,19 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
 		uh_send = packet_get(p, i, 0, sizeof(*uh), &len);
 		if (!uh_send)
 			return p->count;
+
+		mm[i].msg_hdr.msg_name = sa;
+		mm[i].msg_hdr.msg_namelen = sl;
+		count++;
+
 		if (!len)
 			continue;

 		m[i].iov_base = (char *)(uh_send + 1);
 		m[i].iov_len = len;

I haven't tested this yet, but:

- shouldn't iov_len be set to 0 (moving also this line before)? Note that I'm not initialising m

- shouldn't iov_base point to NULL to avoid noise from valgrind?

Also:

-		mm[i].msg_hdr.msg_name = sa;
-		mm[i].msg_hdr.msg_namelen = sl;
-
 		mm[i].msg_hdr.msg_iov = m + i;
 		mm[i].msg_hdr.msg_iovlen = 1;

...I guess we should still go through those even if the size is zero, because we're appending a message. If we don't, I would expect some subsequent messages in the batch to be dropped (as many as zero sized packets we have).

That is, I suppose we could just drop the continue statement on if (!len) above -- but, again, I haven't tested it.

-- 
Stefano
On Fri, Sep 09, 2022 at 11:26:58AM +0200, Stefano Brivio wrote:
 		m[i].iov_base = (char *)(uh_send + 1);
 		m[i].iov_len = len;
I haven't tested this yet, but:
- shouldn't iov_len be set to 0 (moving also this line before)? Note that I'm not initialising m
- shouldn't iov_base point to NULL to avoid noise from valgrind?
No, because with this change m[i] is entirely unreferenced by mm[].
Also:
-		mm[i].msg_hdr.msg_name = sa;
-		mm[i].msg_hdr.msg_namelen = sl;
-
 		mm[i].msg_hdr.msg_iov = m + i;
 		mm[i].msg_hdr.msg_iovlen = 1;
...I guess we should still go through those even if the size is zero, because we're appending a message. If we don't, I would expect some subsequent messages in the batch to be dropped (as many as zero sized packets we have).
Here I'm relying on the fact that mm[] (unlike m[]) *is* initialized, so if we don't alter it here, msg_iov is NULL and msg_iovlen is 0. I was looking at removing that initialization, but I haven't gotten that working yet.
That is, I suppose we could just drop the continue statement on if (!len) above -- but, again, I haven't tested it.
My first version actually did that, so it also works, but I think setting msg_iovlen to 0 is a bit neater.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
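The header-array handling discussed in this exchange can be sketched as follows. This is only an illustration under stated assumptions, not the code from udp.c: send_batch_with_empty() and batch_roundtrip() are hypothetical names, a plain sendmsg() loop over struct msghdr stands in for the sendmmsg() batch, and an AF_UNIX datagram pair stands in for the UDP socket. It shows that a zeroed header array lets an empty payload keep msg_iovlen == 0, while the slot still has to be counted so that later messages are not lost.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define BATCH 3

/* Build BATCH headers from a zeroed array and send them in order; the
 * middle payload is empty. Returns how many messages were sent.
 */
int send_batch_with_empty(int fd)
{
	char a[] = "abc", b[] = "de";
	char *payload[BATCH] = { a, NULL, b };
	size_t len[BATCH] = { 3, 0, 2 };
	struct msghdr mh[BATCH];
	struct iovec m[BATCH];	/* not initialised, like m[] in the thread */
	int i, count = 0, sent = 0;

	memset(mh, 0, sizeof(mh));	/* every msg_iov NULL, msg_iovlen 0 */

	for (i = 0; i < BATCH; i++) {
		count++;	/* the slot is used even without payload */

		/* Zero-length packet: don't use any buffer, msg_iovlen is 0 */
		if (!len[i])
			continue;

		m[i].iov_base = payload[i];
		m[i].iov_len = len[i];
		mh[i].msg_iov = m + i;
		mh[i].msg_iovlen = 1;
	}

	for (i = 0; i < count; i++)
		if (sendmsg(fd, mh + i, 0) >= 0)
			sent++;

	return sent;
}

/* Hypothetical check: 0 if the receiver sees lengths 3, 0, 2 in order */
int batch_roundtrip(void)
{
	ssize_t want[BATCH] = { 3, 0, 2 };
	char buf[16];
	int sv[2], i, ret = 0;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv))
		return -1;

	if (send_batch_with_empty(sv[0]) != BATCH)
		ret = -1;

	for (i = 0; !ret && i < BATCH; i++)
		if (recv(sv[1], buf, sizeof(buf), MSG_DONTWAIT) != want[i])
			ret = -1;

	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

Note that m[1] is never written and never referenced by any header, which is why leaving it uninitialised is harmless in this arrangement.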
On Fri, 9 Sep 2022 20:39:44 +1000
David Gibson wrote:
Here I'm relying on the fact that mm[] (unlike m[]) *is* initialized, so if we don't alter it here, msg_iov is NULL and msg_iovlen is 0.
I was looking at removing that initialization, but I haven't gotten that working yet.
Oops, I see now.

So, I suppose that if you want to drop that initialisation, you might need to zero msg_hdr.controllen as well. And msg_hdr.control too: other than keeping valgrind happy, not leaking random stuff to the kernel might make this marginally more secure.

That should be better than the huge memset() at the beginning, because we're already writing to msg_iovlen anyway.

If you already tried that, though, I don't have any other quick idea.

By the way, I had a mechanism in place, just for TCP though, to avoid reassigning those pointers and also length descriptors. I got rid of it in commit 38fbfdbcb95d ("tcp: Get rid of iov with cached MSS, drop sendmmsg(), add deferred flush") because it didn't really help with throughput. I don't see any significant "userspace" overhead on guest-to-host TCP paths with perf(1).

...maybe for UDP that's different, I haven't focused that much on UDP performance.
That is, I suppose we could just drop the continue statement on if (!len) above -- but, again, I haven't tested it.
My first version actually did that, so it also works, but I think setting msg_iovlen to 0 is a bit neater.
Right. Maybe it was just me being thick, or perhaps that could use a comment:

		/* Zero-length packet: don't use any buffer, msg_iovlen is 0 */
		if (!len)
			continue;

-- 
Stefano
On Fri, Sep 09, 2022 at 06:06:59PM +0200, Stefano Brivio wrote:
Oops, I see now.
So, I suppose that if you want to drop that initialisation, you might need to zero msg_hdr.controllen as well.
Duh. I completely failed to consider the other fields. I actually suspect msg_hdr.flags is the most vital one (without flags I don't know if it will examine control or controllen). But in any case I'm initializing them all now and it's working.
-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Tue, 13 Sep 2022 16:39:26 +1000
David Gibson wrote:
Duh. I completely failed to consider the other fields. I actually suspect msg_hdr.flags is the most vital one (without flags I don't know if it will examine control or controllen).
Hmm, if we're talking about msg_flags, it should be ignored on sendmsg(), and only used for received messages flags (MSG_EOR, MSG_TRUNC, MSG_CTRUNC, MSG_OOB, MSG_ERRQUEUE) on recvmsg(). But,
But in any case I'm initializing them all now and it's working.
yes, I guess it's a good idea to avoid sending the kernel random bytes there, in any case.

-- 
Stefano
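The distinction drawn above between the msg_flags field and the flags argument can be checked with a small sketch. It assumes Linux behaviour as documented in send(2) and recv(2); msg_flags_direction() is a hypothetical name, and an AF_UNIX datagram pair again stands in for a UDP socket. The sender deliberately fills msg_flags with junk, while the receiver uses a buffer that is too short, so that the kernel reports MSG_TRUNC back through msg_flags.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Returns 1 if sending with junk in msg_flags succeeds and the receiver
 * gets MSG_TRUNC reported in its own msg_flags, 0 if not, -1 on setup
 * error.
 */
int msg_flags_direction(void)
{
	char out[] = "12345678", in[4];
	struct iovec siov = { .iov_base = out, .iov_len = 8 };
	struct iovec riov = { .iov_base = in, .iov_len = sizeof(in) };
	struct msghdr smh, rmh;
	int sv[2], ok = 0;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv))
		return -1;

	memset(&smh, 0, sizeof(smh));
	smh.msg_iov = &siov;
	smh.msg_iovlen = 1;
	smh.msg_flags = ~0;	/* junk: not an input to sendmsg() */

	memset(&rmh, 0, sizeof(rmh));
	rmh.msg_iov = &riov;
	rmh.msg_iovlen = 1;

	/* 8 bytes go out, only 4 fit in the receive buffer */
	if (sendmsg(sv[0], &smh, 0) == 8 &&
	    recvmsg(sv[1], &rmh, MSG_DONTWAIT) == 4 &&
	    (rmh.msg_flags & MSG_TRUNC))
		ok = 1;

	close(sv[0]);
	close(sv[1]);
	return ok;
}
```

Zeroing the field before sending remains the tidier choice, as agreed in the thread, since it avoids handing uninitialised bytes to the kernel.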
On Tue, Sep 13, 2022 at 10:08:46AM +0100, Stefano Brivio wrote:
Hmm, if we're talking about msg_flags, it should be ignored on sendmsg(), and only used for received messages flags (MSG_EOR, MSG_TRUNC, MSG_CTRUNC, MSG_OOB, MSG_ERRQUEUE) on recvmsg().
Oh, right, I was mixing up msg_flags and the separate flags argument to sendmsg() and sendmmsg().
But,
But in any case I'm initializing them all now and it's working.
yes, I guess it's a good idea to avoid sending the kernel random bytes there, in any case.
Right.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
Because UDP is connectionless we don't have an in-built end-of-stream
signal for our connectivity tests. We work around this by explicitly
adding an end marker to our sample data and killing the listening end once
it is seen.
However, socat has some built-in options - null-eof and shut-null - which
can be used to signal the end of stream with a zero-length UDP packet.
Use these to simplify how the UDP tests are implemented.
Signed-off-by: David Gibson
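To illustrate the idea behind this change, here is a minimal sketch of a receiver that treats a zero-length datagram as the end-of-stream signal. It is not how socat implements null-eof or shut-null, and it is not part of the test scripts: recv_until_empty() and end_marker_roundtrip() are hypothetical names, and an AF_UNIX datagram pair is assumed in place of a UDP socket.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Read datagrams until an empty one arrives. Returns the total number
 * of payload bytes seen before the end marker, or -1 on error.
 */
long recv_until_empty(int fd)
{
	char buf[2048];
	long total = 0;
	ssize_t n;

	while ((n = recv(fd, buf, sizeof(buf), 0)) > 0)
		total += n;

	return n < 0 ? -1 : total;
}

/* Hypothetical demo: two data packets, then an empty one as end marker */
long end_marker_roundtrip(void)
{
	long total = -1;
	int sv[2];

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv))
		return -1;

	/* Only start receiving if all three datagrams were queued */
	if (send(sv[0], "abc", 3, 0) == 3 &&
	    send(sv[0], "de", 2, 0) == 2 &&
	    send(sv[0], "", 0, 0) == 0)
		total = recv_until_empty(sv[1]);

	close(sv[0]);
	close(sv[1]);
	return total;
}
```

With this convention the sample data needs no explicit end marker, but it only works if zero-length packets are actually forwarded, which is what the previous patch in this series addresses.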