[PATCH v2 0/7] Allow more use of iovecs in pcap and tap interfaces
While working on other stuff I stumbled across some patches I wrote quite a while back (part of a larger series on indefinite hiatus). These make it easier to use the pcap and tap functions with frames that aren't in a single contiguous buffer. This overlaps with some of the work in Laurent's vhost-user series, so I've included the first of his patches and integrated it with my changes. There will The first two patches have some overlap with the preliminary iovec patches in the vhost-user series (and will certainly conflict). I do think the pcap interface here is slightly better than the one in the vhost-user series, although I did ack that. Stefano, if you've already applied / run tests for Laurent's series then go ahead with it; I'll rework this on top of those. Changes since v1: * Stefano correctly pointed out that my iov_offset() function was unclear in its naming and description. Rename it to iov_skip_bytes() and change the wording around it to improve this. * Incorporate Laurent's iov helper patch, and also use iov_skip_bytes() to slightly simplify its functions. * Fix commit message for write_remainder() so that it correctly describes handling "short writes", not "sort writes", whatever that might mean. * Adjust parameter names and descriptions for pcap_multiple() for clarity. David Gibson (6): iov: Add helper to find skip over first n bytes of an io vector pcap: Update pcap_frame() to take an iovec and offset util: Add write_remainder() helper pcap: Handle short writes in pcap_frame() pcap: Allow pcap_frame() and pcap_multiple() to take multi-buffer frames tap: Use write_remainder() in tap_send_frames_passt() Laurent Vivier (1): iov: add some functions to manage iovec Makefile | 8 +-- iov.c | 198 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ iov.h | 31 +++++++++ pcap.c | 56 ++++++++-------- pcap.h | 3 +- tap.c | 41 +++--------- util.c | 33 ++++++++++ util.h | 1 + 8 files changed, 307 insertions(+), 64 deletions(-) create mode 100644 iov.c create mode 100644 iov.h -- 2.43.2
From: Laurent Vivier
On Wed, 28 Feb 2024 12:52:00 +1100
David Gibson
From: Laurent Vivier
Introduce functions to copy to/from a buffer from/to an iovec array, to compute data length in in bytes of an iovec and to copy memory from an iovec to another.
iov_from_buf(), iov_to_buf(), iov_size(), iov_copy().
Signed-off-by: Laurent Vivier
Message-ID: <20240217150725.661467-2-lvivier@redhat.com> [dwg: Small changes to suppress cppcheck warnings] Signed-off-by: David Gibson
I have a long list of style comments on this one, but nothing functional, so I'll go ahead and apply it to unblock this series, and post fix-up patches later. -- Stefano
Several of the IOV functions in iov.c, and also tap_send_frames_passt()
needs to determine which buffer element a byte offset into an IO vector
lies in. Split this out into a helper function iov_skip_bytes().
Signed-off-by: David Gibson
On Wed, 28 Feb 2024 12:52:01 +1100
David Gibson
Several of the IOV functions in iov.c, and also tap_send_frames_passt() needs to determine which buffer element a byte offset into an IO vector lies in. Split this out into a helper function iov_skip_bytes().
Signed-off-by: David Gibson
--- iov.c | 42 +++++++++++++++++++++++++++++++++--------- iov.h | 2 ++ tap.c | 12 +++++------- 3 files changed, 40 insertions(+), 16 deletions(-) diff --git a/iov.c b/iov.c index 8a48acb1..e3312628 100644 --- a/iov.c +++ b/iov.c @@ -25,6 +25,36 @@ #include "util.h" #include "iov.h"
+ +/* iov_skip_bytes() - Skip the first n bytes into an IO vector + * @iov: IO vector + * @n: Number of entries in @iov
...which is a different 'n' compared to two lines above.
+ * @vec_offset: Total byte offset into the IO vector
This doesn't clearly correlate with the description of the function ("first n bytes"). Same here, I have no other comments about the series, so I'll apply this and try to improve this a bit as a follow-up.
+ * @buf_offset: Offset into a single buffer of the IO vector + * + * Return: index I of individual struct iovec which contains the byte at + * @vec_offset bytes into the vector (as though all its buffers were + * contiguous). If @buf_offset is non-NULL, update it to the offset of + * that byte within @iov[I] (guaranteed to be less than @iov[I].iov_len) + * If the whole vector has <= @vec_offset bytes, return @n. + */ +size_t iov_skip_bytes(const struct iovec *iov, size_t n, + size_t vec_offset, size_t *buf_offset)
-- Stefano
On Thu, Feb 29, 2024 at 09:10:02AM +0100, Stefano Brivio wrote:
On Wed, 28 Feb 2024 12:52:01 +1100 David Gibson
wrote: Several of the IOV functions in iov.c, and also tap_send_frames_passt() needs to determine which buffer element a byte offset into an IO vector lies in. Split this out into a helper function iov_skip_bytes().
Signed-off-by: David Gibson
--- iov.c | 42 +++++++++++++++++++++++++++++++++--------- iov.h | 2 ++ tap.c | 12 +++++------- 3 files changed, 40 insertions(+), 16 deletions(-) diff --git a/iov.c b/iov.c index 8a48acb1..e3312628 100644 --- a/iov.c +++ b/iov.c @@ -25,6 +25,36 @@ #include "util.h" #include "iov.h"
+ +/* iov_skip_bytes() - Skip the first n bytes into an IO vector + * @iov: IO vector + * @n: Number of entries in @iov
...which is a different 'n' compared to two lines above.
Ouch, good point.
+ * @vec_offset: Total byte offset into the IO vector
This doesn't clearly correlate with the description of the function ("first n bytes").
Same here, I have no other comments about the series, so I'll apply this and try to improve this a bit as a follow-up.
Ok, thanks. I might see if I can improve this one too.
+ * @buf_offset: Offset into a single buffer of the IO vector + * + * Return: index I of individual struct iovec which contains the byte at + * @vec_offset bytes into the vector (as though all its buffers were + * contiguous). If @buf_offset is non-NULL, update it to the offset of + * that byte within @iov[I] (guaranteed to be less than @iov[I].iov_len) + * If the whole vector has <= @vec_offset bytes, return @n. + */ +size_t iov_skip_bytes(const struct iovec *iov, size_t n, + size_t vec_offset, size_t *buf_offset)
-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
Update the low-level helper pcap_frame() to take a struct iovec and
offset within it, rather than an explicit pointer and length for the
frame. This moves the handling of an offset (to skip vnet_len) from
pcap_multiple() to pcap_frame().
This doesn't accomplish a great deal immediately, but will make
subsequent changes easier.
Signed-off-by: David Gibson
We have several places where we want to write(2) a buffer or buffers and we
handle short write()s by retrying until everything is successfully written.
Add a helper for this in util.c.
This version has some differences from the typical write_all() function.
First, take an IO vector rather than a single buffer, because that will be
useful for some of our cases. Second, allow it to take an parameter to
skip the first n bytes of the given buffers. This will be useful for some
of the cases we want, and also falls out quite naturally from the
implementation.
Signed-off-by: David Gibson
Currently pcap_frame() assumes that if write() doesn't return an error, it
has written everything we want. That's not necessarily true, because it
could return a short write. That's not likely to happen on a regular file,
but there's not a lot of reason not to be robust here; it's conceivable we
might want to direct the pcap fd at a named pipe or similar.
So, make pcap_frame() handle short frames by using the write_remainder()
helper.
Signed-off-by: David Gibson
pcap_frame() explicitly takes a single frame, and only allows a single
buffer (iovec) to be passed. pcap_multiple() takes multiple buffers, but
explicitly expects exactly one frame per buffer.
Future changes are going to want to split single frames across multiple
buffers in some circumstances, so extend the pcap functions to allow for
that.
Signed-off-by: David Gibson
When we determine we have sent a partial frame in tap_send_frames_passt(),
we call tap_send_remainder() to send the remainder of it. The logic in
that function is very similar to that in the more general write_remainder()
except that it uses send() instead of write()/writev(). But we are dealing
specifically with the qemu socket here, which is a connected stream socket.
In that case write()s do the same thing as send() with the options we were
using, so we can just reuse write_remainder().
Signed-off-by: David Gibson
On Wed, 28 Feb 2024 12:51:59 +1100
David Gibson
While working on other stuff I stumbled across some patches I wrote quite a while back (part of a larger series on indefinite hiatus). These make it easier to use the pcap and tap functions with frames that aren't in a single contiguous buffer.
This overlaps with some of the work in Laurent's vhost-user series, so I've included the first of his patches and integrated it with my changes. There will
The first two patches have some overlap with the preliminary iovec patches in the vhost-user series (and will certainly conflict). I do think the pcap interface here is slightly better than the one in the vhost-user series, although I did ack that.
Stefano, if you've already applied / run tests for Laurent's series then go ahead with it; I'll rework this on top of those.
No, I hadn't. This one is applied now. -- Stefano
participants (2)
-
David Gibson
-
Stefano Brivio