[PATCH 0/7] Rework some IOV handling in TCP code
These reworks are largely aimed at making the vhost-user integration easier, and with luck allowing more logic to be shared between it and the existing "buffer" paths. Of course, in the short term, these will probably conflict with the patches... I hope it ends up as a net positive, Laurent, let me know.
I think a number of similar changes should be possible for UDP, but I haven't tackled that yet.
David Gibson (7):
  tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]()
  tcp: Move tcp_l2_buf_fill_headers() to tcp_buf.c
  tcp: Rework tcp_l2_buf_fill_headers() into tcp_buf_make_frame()
  tcp: Don't use return value from tcp_fill_headers[46] to adjust iov_len
  tcp: Pass TCP header and payload separately to tcp_fill_headers[46]()
  tcp: Merge tcp_update_check_tcp[46]()
  tcp: Fold tcp_update_csum() into tcp_fill_header()
 tcp.c          | 232 +++++++++++--------------------------------------
 tcp_buf.c      |  48 +++++++---
 tcp_internal.h |  15 +++-
 3 files changed, 100 insertions(+), 195 deletions(-)
--
2.47.0
[PATCH 1/7] tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]()
Currently these expect both the TCP header and payload in a single IOV,
and go to some trouble to locate the checksum field within it. In the
current caller we already know where the TCP header is, so we might as
well just pass it in. This will need to work a bit differently for
vhost-user, but that code already needs to locate the TCP header for other
reasons, so again we can just pass it in.
Signed-off-by: David Gibson
On Mon, 28 Oct 2024 20:40:44 +1100
David Gibson wrote:
> Currently these expect both the TCP header and payload in a single IOV, and go to some trouble to locate the checksum field within it. In the current caller we already know where the TCP header is, so we might as well just pass it in. This will need to work a bit differently for vhost-user, but that code already needs to locate the TCP header for other reasons, so again we can just pass it in.
We couldn't do this, and also what you're now doing in 5/7, because
with vhost-user the TCP header is not aligned, so we can't pass it
around as a pointer, see:
On Mon, Oct 28, 2024 at 07:42:54PM +0100, Stefano Brivio wrote:
> On Mon, 28 Oct 2024 20:40:44 +1100
> David Gibson wrote:
> > Currently these expect both the TCP header and payload in a single IOV, and go to some trouble to locate the checksum field within it. In the current caller we already know where the TCP header is, so we might as well just pass it in. This will need to work a bit differently for vhost-user, but that code already needs to locate the TCP header for other reasons, so again we can just pass it in.
> We couldn't do this, and also what you're now doing in 5/7, because with vhost-user the TCP header is not aligned, so we can't pass it around as a pointer, see:
> https://archives.passt.top/passt-dev/ZeUpxEY-sn64NLE5@zatzit/
> and following. That one is about IP headers, but the same applies to TCP and UDP headers.
Hrm. I'm aware it theoretically need not be aligned, but I thought it was in practice.. and that we were already relying on that.
In fact, I'm pretty sure the second part is true, although more subtly than here. v8 of the vhost-user patches calls tcp_fill_headers[46]() with the bp parameter set to the offset of the TCP header. If creating a tcphdr * there is a problem, then creating a tcp_payload_t * can't be any better.
> Of course the current solution is not elegant and it would be nice to find another way to deal with it, but we couldn't come up with anything better back then.
> The rest of the series looks good to me, but I'm afraid that without this one and 5/7 the other changes will be a bit more complicated to implement (if at all possible).
Definitely. I have some ideas for approaches more robust to misalignment, but they're substantially more complicated. I was hoping we could avoid it at least for now.
--
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
On Tue, Oct 29, 2024 at 02:02:25PM +1100, David Gibson wrote:
> On Mon, Oct 28, 2024 at 07:42:54PM +0100, Stefano Brivio wrote:
> > On Mon, 28 Oct 2024 20:40:44 +1100
> > David Gibson wrote:
> > > Currently these expect both the TCP header and payload in a single IOV, and go to some trouble to locate the checksum field within it. In the current caller we already know where the TCP header is, so we might as well just pass it in. This will need to work a bit differently for vhost-user, but that code already needs to locate the TCP header for other reasons, so again we can just pass it in.
> > We couldn't do this, and also what you're now doing in 5/7, because with vhost-user the TCP header is not aligned, so we can't pass it around as a pointer, see:
> > https://archives.passt.top/passt-dev/ZeUpxEY-sn64NLE5@zatzit/
> > and following. That one is about IP headers, but the same applies to TCP and UDP headers.
> Hrm. I'm aware it theoretically need not be aligned, but I thought it was in practice.. and that we were already relying on that.
> In fact, I'm pretty sure the second part is true, although more subtly than here. v8 of the vhost-user patches calls tcp_fill_headers[46]() with the bp parameter set to the offset of the TCP header. If creating a tcphdr * there is a problem, then creating a tcp_payload_t * can't be any better.
> > Of course the current solution is not elegant and it would be nice to find another way to deal with it, but we couldn't come up with anything better back then.
> > The rest of the series looks good to me, but I'm afraid that without this one and 5/7 the other changes will be a bit more complicated to implement (if at all possible).
> Definitely. I have some ideas for approaches more robust to misalignment, but they're substantially more complicated. I was hoping we could avoid it at least for now.
I had a closer look at that earlier message now. I believe at the time I was aiming for fully robust handling of misaligned user buffers. AIUI, we've given up on that for the time being: instead we'll just *test* for suitable alignment and we can do the hard work of handling it if it ever arises in practice.
--
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
On Tue, 29 Oct 2024 15:07:56 +1100
David Gibson wrote:
> On Tue, Oct 29, 2024 at 02:02:25PM +1100, David Gibson wrote:
> > On Mon, Oct 28, 2024 at 07:42:54PM +0100, Stefano Brivio wrote:
> > > On Mon, 28 Oct 2024 20:40:44 +1100
> > > David Gibson wrote:
> > > > Currently these expect both the TCP header and payload in a single IOV, and go to some trouble to locate the checksum field within it. In the current caller we already know where the TCP header is, so we might as well just pass it in. This will need to work a bit differently for vhost-user, but that code already needs to locate the TCP header for other reasons, so again we can just pass it in.
> > > We couldn't do this, and also what you're now doing in 5/7, because with vhost-user the TCP header is not aligned, so we can't pass it around as a pointer, see:
> > > https://archives.passt.top/passt-dev/ZeUpxEY-sn64NLE5@zatzit/
> > > and following. That one is about IP headers, but the same applies to TCP and UDP headers.
> > Hrm. I'm aware it theoretically need not be aligned, but I thought it was in practice.. and that we were already relying on that.
> > In fact, I'm pretty sure the second part is true, although more subtly than here. v8 of the vhost-user patches calls tcp_fill_headers[46]() with the bp parameter set to the offset of the TCP header. If creating a tcphdr * there is a problem, then creating a tcp_payload_t * can't be any better.
Ah, okay, I missed that. Still, I think we should ask gcc for an opinion (with the vhost-user series on top of this series), because those build-time pointer alignment checks are pretty reliable.
> > > Of course the current solution is not elegant and it would be nice to find another way to deal with it, but we couldn't come up with anything better back then.
> > > The rest of the series looks good to me, but I'm afraid that without this one and 5/7 the other changes will be a bit more complicated to implement (if at all possible).
> > Definitely. I have some ideas for approaches more robust to misalignment, but they're substantially more complicated. I was hoping we could avoid it at least for now.
> I had a closer look at that earlier message now. I believe at the time I was aiming for fully robust handling of misaligned user buffers. AIUI, we've given up on that for the time being: instead we'll just *test* for suitable alignment and we can do the hard work of handling it if it ever arises in practice.
Right, and we can use the compiler to test for suitable alignment.
--
Stefano
On Tue, Oct 29, 2024 at 10:09:54AM +0100, Stefano Brivio wrote:
> On Tue, 29 Oct 2024 15:07:56 +1100
> David Gibson wrote:
> > On Tue, Oct 29, 2024 at 02:02:25PM +1100, David Gibson wrote:
> > > On Mon, Oct 28, 2024 at 07:42:54PM +0100, Stefano Brivio wrote:
> > > > On Mon, 28 Oct 2024 20:40:44 +1100
> > > > David Gibson wrote:
> > > > > Currently these expect both the TCP header and payload in a single IOV, and go to some trouble to locate the checksum field within it. In the current caller we already know where the TCP header is, so we might as well just pass it in. This will need to work a bit differently for vhost-user, but that code already needs to locate the TCP header for other reasons, so again we can just pass it in.
> > > > We couldn't do this, and also what you're now doing in 5/7, because with vhost-user the TCP header is not aligned, so we can't pass it around as a pointer, see:
> > > > https://archives.passt.top/passt-dev/ZeUpxEY-sn64NLE5@zatzit/
> > > > and following. That one is about IP headers, but the same applies to TCP and UDP headers.
> > > Hrm. I'm aware it theoretically need not be aligned, but I thought it was in practice.. and that we were already relying on that.
> > > In fact, I'm pretty sure the second part is true, although more subtly than here. v8 of the vhost-user patches calls tcp_fill_headers[46]() with the bp parameter set to the offset of the TCP header. If creating a tcphdr * there is a problem, then creating a tcp_payload_t * can't be any better.
> Ah, okay, I missed that. Still, I think we should ask gcc for an opinion (with the vhost-user series on top of this series), because those build-time pointer alignment checks are pretty reliable.
I'm not exactly sure what you're suggesting with this. I don't think the compiler will catch it in this case, because we're constructing the (possibly) misaligned pointer as a (void *), then implicitly casting it by passing it to a (tcp_payload_t *) argument. (void *) is explicitly allowed to be cast to any pointer type, so I think the compiler will take this as asserting we know what we're doing. More fool it.
> > > > Of course the current solution is not elegant and it would be nice to find another way to deal with it, but we couldn't come up with anything better back then.
> > > > The rest of the series looks good to me, but I'm afraid that without this one and 5/7 the other changes will be a bit more complicated to implement (if at all possible).
> > > Definitely. I have some ideas for approaches more robust to misalignment, but they're substantially more complicated. I was hoping we could avoid it at least for now.
> > I had a closer look at that earlier message now. I believe at the time I was aiming for fully robust handling of misaligned user buffers. AIUI, we've given up on that for the time being: instead we'll just *test* for suitable alignment and we can do the hard work of handling it if it ever arises in practice.
> Right, and we can use the compiler to test for suitable alignment.
I do see allowing the compiler to check this in more cases as an advantage of using explicitly typed pointers where we can.
Btw, I didn't find a use for it just yet, but I also have a draft patch which adds a function+macro that extracts a typed pointer from a given offset into an IO vector, verifying that it's contiguous and properly aligned.
--
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
On Tue, 29 Oct 2024 20:26:25 +1100
David Gibson wrote:
> On Tue, Oct 29, 2024 at 10:09:54AM +0100, Stefano Brivio wrote:
> > On Tue, 29 Oct 2024 15:07:56 +1100
> > David Gibson wrote:
> > > On Tue, Oct 29, 2024 at 02:02:25PM +1100, David Gibson wrote:
> > > > On Mon, Oct 28, 2024 at 07:42:54PM +0100, Stefano Brivio wrote:
> > > > > On Mon, 28 Oct 2024 20:40:44 +1100
> > > > > David Gibson wrote:
> > > > > > Currently these expect both the TCP header and payload in a single IOV, and go to some trouble to locate the checksum field within it. In the current caller we already know where the TCP header is, so we might as well just pass it in. This will need to work a bit differently for vhost-user, but that code already needs to locate the TCP header for other reasons, so again we can just pass it in.
> > > > > We couldn't do this, and also what you're now doing in 5/7, because with vhost-user the TCP header is not aligned, so we can't pass it around as a pointer, see:
> > > > > https://archives.passt.top/passt-dev/ZeUpxEY-sn64NLE5@zatzit/
> > > > > and following. That one is about IP headers, but the same applies to TCP and UDP headers.
> > > > Hrm. I'm aware it theoretically need not be aligned, but I thought it was in practice.. and that we were already relying on that.
> > > > In fact, I'm pretty sure the second part is true, although more subtly than here. v8 of the vhost-user patches calls tcp_fill_headers[46]() with the bp parameter set to the offset of the TCP header. If creating a tcphdr * there is a problem, then creating a tcp_payload_t * can't be any better.
> > Ah, okay, I missed that. Still, I think we should ask gcc for an opinion (with the vhost-user series on top of this series), because those build-time pointer alignment checks are pretty reliable.
> I'm not exactly sure what you're suggesting with this. I don't think the compiler will catch it in this case, because we're constructing the (possibly) misaligned pointer as a (void *), then implicitly casting it by passing it to a (tcp_payload_t *) argument. (void *) is explicitly allowed to be cast to any pointer type, so I think the compiler will take this as asserting we know what we're doing. More fool it.
Oh, hm, right. In the original case we were discussing in that thread it was coming from an offset in a static struct, but if it's not the case anymore, then we should check ourselves I guess (possibly with the function + macro you mention below?).
> > > > > Of course the current solution is not elegant and it would be nice to find another way to deal with it, but we couldn't come up with anything better back then.
> > > > > The rest of the series looks good to me, but I'm afraid that without this one and 5/7 the other changes will be a bit more complicated to implement (if at all possible).
> > > > Definitely. I have some ideas for approaches more robust to misalignment, but they're substantially more complicated. I was hoping we could avoid it at least for now.
> > > I had a closer look at that earlier message now. I believe at the time I was aiming for fully robust handling of misaligned user buffers. AIUI, we've given up on that for the time being: instead we'll just *test* for suitable alignment and we can do the hard work of handling it if it ever arises in practice.
> > Right, and we can use the compiler to test for suitable alignment.
> I do see allowing the compiler to check this in more cases as an advantage of using explicitly typed pointers where we can.
> Btw, I didn't find a use for it just yet, but I also have a draft patch which adds a function+macro that extracts a typed pointer from a given offset into an IO vector, verifying that it's contiguous and properly aligned.
--
Stefano
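The function+macro discussed in this exchange is not shown anywhere in the thread. Purely as an illustration, a helper of that kind might look roughly like the sketch below. The names iov_peek_() and IOV_PEEK() are hypothetical and are not taken from the draft patch, and the exact set of checks (bounds, contiguity within one entry, alignment via _Alignof) is an assumption about what such a helper would verify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical helper, not from the series: return a pointer to 'len'
 * bytes at 'offset' into the vector, or NULL if that range is out of
 * bounds, crosses an entry boundary, or is not aligned to 'align'.
 */
static void *iov_peek_(const struct iovec *iov, size_t cnt,
		       size_t offset, size_t len, size_t align)
{
	size_t i;
	char *p;

	assert(align != 0);

	/* Skip whole entries that lie before the offset */
	for (i = 0; i < cnt && offset >= iov[i].iov_len; i++)
		offset -= iov[i].iov_len;

	if (i == cnt)
		return NULL;		/* offset is past the end */

	if (len > iov[i].iov_len - offset)
		return NULL;		/* range is not contiguous */

	p = (char *)iov[i].iov_base + offset;
	if ((uintptr_t)p % align)
		return NULL;		/* misaligned for the type */

	return p;
}

/* Typed wrapper: evaluates to a (type *), or NULL if any check fails */
#define IOV_PEEK(type, iov, cnt, offset)				\
	((type *)iov_peek_((iov), (cnt), (offset),			\
			   sizeof(type), _Alignof(type)))
```

Because the alignment is tested at run time on the actual address, this catches the case where the pointer started life as a (void *) and the compiler has nothing to warn about.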
[PATCH 2/7] tcp: Move tcp_l2_buf_fill_headers() to tcp_buf.c
This function only has callers in tcp_buf.c. More importantly, it's
inherently tied to the "buf" path, because it uses internal knowledge of
how we lay out the various headers across our locally allocated buffers.
Therefore, move it to tcp_buf.c.
Signed-off-by: David Gibson
[PATCH 3/7] tcp: Rework tcp_l2_buf_fill_headers() into tcp_buf_make_frame()
tcp_l2_buf_fill_headers() is always followed by updating the payload IOV
entry to the correct length of the frame. It already needs knowledge of
the frame/IOV layout, so we might as well perform that update inside the
function. Rename it to tcp_buf_make_frame() to reflect its expanded
duties.
While we're there use some temporaries to make our dissection of the IOV a
bit clearer.
Signed-off-by: David Gibson
[PATCH 4/7] tcp: Don't use return value from tcp_fill_headers[46] to adjust iov_len
Currently tcp_fill_headers[46]() return the size of the IP payload, which
we use to adjust the size of the last IOV entry for the frame, so that it
only includes the expected data. This was originally done to isolate
knowledge of the header layout to the header building functions. However,
we have since reorganized from a single buffer for the frame to an IO vector
of pieces, which means we already know something about the layout in the
caller.
Use that knowledge to adjust iov_len *before* we call tcp_fill_headers*().
This means that the header building functions are called with the IOV
containing the frame and only the frame, which will be useful later on.
Signed-off-by: David Gibson
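As a minimal sketch of the ordering described in this commit message, the caller can trim the payload entry first, so that anything called afterwards sees a vector describing exactly the frame. The enum, frame_prepare() and frame_l4_len() are hypothetical stand-ins for illustration and are not the identifiers used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical frame layout: one IOV entry per piece, payload last */
enum { IOV_TAP, IOV_ETH, IOV_IP, IOV_PAYLOAD, IOV_NUM };

/* Caller side: trim the payload entry to TCP header plus data *before*
 * any header is built. The entry initially covers the whole buffer.
 */
static void frame_prepare(struct iovec *iov, size_t tcp_hdr_len, size_t dlen)
{
	assert(tcp_hdr_len + dlen <= iov[IOV_PAYLOAD].iov_len);
	iov[IOV_PAYLOAD].iov_len = tcp_hdr_len + dlen;
}

/* Stand-in for a header building function: it can read the L4 length
 * straight from the vector and has no length to return to the caller.
 */
static size_t frame_l4_len(const struct iovec *iov)
{
	return iov[IOV_PAYLOAD].iov_len;
}
```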
[PATCH 5/7] tcp: Pass TCP header and payload separately to tcp_fill_headers[46]()
At the moment these take separate pointers to the tap specific and IP
headers, but expect the TCP header and payload as a single tcp_payload_t.
As well as being slightly inconsistent, this involves some slightly iffy
pointer shenanigans when called on the flags path with a tcp_flags_t
instead of a tcp_payload_t.
More importantly, it's inconvenient for the upcoming vhost-user case, where
the TCP header and payload might not be contiguous. Furthermore, the
payload itself might not be contiguous.
So, pass the TCP header as its own pointer, and the TCP payload as an IO
vector.
Signed-off-by: David Gibson
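To illustrate why a separate header pointer plus a payload IO vector is workable even when the pieces are not contiguous, here is a rough sketch of an Internet checksum computed across both. csum_add() and csum_hdr_iov() are hypothetical names, and the byte-at-a-time loop is chosen for clarity and alignment independence. It is not the optimised checksum code the project uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Add 'len' bytes to a running one's complement sum. '*odd' records
 * whether an odd number of bytes has been consumed so far, so a chunk
 * boundary may fall in the middle of a 16-bit word.
 */
static uint32_t csum_add(uint32_t sum, const void *buf, size_t len, int *odd)
{
	const uint8_t *p = buf;
	size_t i;

	for (i = 0; i < len; i++) {
		sum += *odd ? p[i] : (uint32_t)p[i] << 8;
		sum = (sum & 0xffff) + (sum >> 16);	/* end-around carry */
		*odd = !*odd;
	}

	return sum;
}

/* Checksum over a header and a scattered payload, starting from 'init'
 * (for example a pseudo-header partial sum). The result is a host order
 * value; it would still need htons() before being stored in a header.
 */
static uint16_t csum_hdr_iov(uint32_t init, const void *hdr, size_t hdr_len,
			     const struct iovec *payload, size_t cnt)
{
	uint32_t sum = (init & 0xffff) + (init >> 16);
	int odd = 0;
	size_t i;

	assert(hdr || !hdr_len);
	assert(payload || !cnt);

	sum = csum_add(sum, hdr, hdr_len, &odd);
	for (i = 0; i < cnt; i++)
		sum = csum_add(sum, payload[i].iov_base,
			       payload[i].iov_len, &odd);

	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

Splitting the same bytes differently between the header and the vector entries gives the same result, which is the property the vhost-user case would rely on.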
[PATCH 6/7] tcp: Merge tcp_update_check_tcp[46]()
The only reason we need separate functions for the IPv4 and IPv6 case is
to calculate the checksum of the IP pseudo-header, which is different for
the two cases. However, the caller already knows which path it's on and
can access the values needed for the pseudo-header partial sum more easily
than tcp_update_check_tcp[46]() can.
So, merge these functions into a single tcp_update_csum() function that
just takes the pseudo-header partial sum, calculated in the caller.
Signed-off-by: David Gibson
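The split described here can be sketched as follows: the caller computes a pseudo-header partial sum for its own IP version, and one version-independent function finishes the checksum. psum_v4(), psum_v6(), sum_be16() and l4_csum() are hypothetical names for illustration; the patch's own helpers and signatures may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Add 'len' bytes to 'sum' as big-endian 16-bit words, reading byte by
 * byte so no alignment is assumed. A trailing odd byte is zero padded.
 */
static uint32_t sum_be16(uint32_t sum, const void *buf, size_t len)
{
	const uint8_t *p = buf;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (len & 1)
		sum += (uint32_t)p[len - 1] << 8;

	return sum;
}

/* Caller side, IPv4 pseudo-header: addresses, protocol, 16-bit length */
static uint32_t psum_v4(const struct in_addr *src, const struct in_addr *dst,
			uint8_t proto, uint16_t l4len)
{
	uint32_t sum;

	assert(src && dst);
	sum = sum_be16(0, src, sizeof(*src));
	sum = sum_be16(sum, dst, sizeof(*dst));

	return sum + proto + l4len;
}

/* Caller side, IPv6 pseudo-header: addresses, 32-bit length, next header */
static uint32_t psum_v6(const struct in6_addr *src,
			const struct in6_addr *dst,
			uint8_t proto, uint32_t l4len)
{
	uint32_t sum;

	assert(src && dst);
	sum = sum_be16(0, src, sizeof(*src));
	sum = sum_be16(sum, dst, sizeof(*dst));

	return sum + (l4len >> 16) + (l4len & 0xffff) + proto;
}

/* Single merged function: it only needs the partial sum and the L4 data.
 * Returns a host order value (htons() it before storing in the header).
 */
static uint16_t l4_csum(uint32_t psum, const void *l4, size_t len)
{
	uint32_t sum = sum_be16(psum, l4, len);

	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```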
[PATCH 7/7] tcp: Fold tcp_update_csum() into tcp_fill_header()
tcp_update_csum() is now simple enough that it makes sense to just fold it
into tcp_fill_header(), meaning the latter now really does fill all the
header fields.
Signed-off-by: David Gibson
participants (2)
- David Gibson
- Stefano Brivio