[PATCH v11 00/30] Introduce discontiguous frames management
This series introduces iov_tail to convey frame information between functions.

v11:
  - invert logic of pool_can_fit() and add iov_tail_prune()

v10:
  - rename iov_tail_drop() to iov_drop_header()
  - rename pool_full() to pool_can_fit()
  - replace NOLINTNEXTLINE(clang-analyzer-core.NonNullParamChecker)
    by ASSERT(iov[i].iov_base);
  - update iov_tail_clone() comment header

v9:
  - address comments from David

v8:
  - rebase
  - rework the two last patches to store the iovec in the p->pkt array

v7:
  - Add a patch to fix comment style of 'Return:'
  - Fix ignore_arp()/accept_arp()
  - Fix coverity error
  - Fix several comments

v6:
  - Replaced iov_slice() with the clearer iov_tail_clone() for creating
    iovec subsets.
  - Standardized local header variable names (to *_storage suffix).
  - Renamed functions for better semantics (e.g., ignore_arp to
    accept_arp, packet_data to packet_get).
  - Corrected OPTLEN_MAX definition in TCP.
  - Addressed minor logic issues (e.g., DHCPv6 FQDN flags, NDP null
    check).
  - Updated ipv6_l4hdr() return type to boolean.
  - Improved comments and documentation across several modules.

v5:
  - store in the pool iovec array with several entries

v4:
  Prepare to introduce iovec array in the pool:
  - pass iov_tail rather than pool to ndp, icmp, dhcp, dhcpv6 and arp
  - remove unused pool macros
  - add memory regions in the pool structure, this will allow us to use
    the buf pointer to store the iovec array for vhost-user

v3:
  - Address comments from David

Laurent Vivier (30):
  arp: Don't mix incoming and outgoing buffers
  iov: Introduce iov_tail_clone() and iov_drop_header().
  iov: Update IOV_REMOVE_HEADER() and IOV_PEEK_HEADER()
  tap: Use iov_tail with tap_add_packet()
  packet: Use iov_tail with packet_add()
  packet: Add packet_data()
  arp: Convert to iov_tail
  ndp: Convert to iov_tail
  icmp: Convert to iov_tail
  udp: Convert to iov_tail
  tcp: Convert tcp_tap_handler() to use iov_tail
  tcp: Convert tcp_data_from_tap() to use iov_tail
  dhcpv6: move offset initialization out of dhcpv6_opt()
  dhcpv6: Extract sending of NotOnLink status
  dhcpv6: Convert to iov_tail
  dhcpv6: Use iov_tail in dhcpv6_opt()
  dhcp: Convert to iov_tail
  ip: Use iov_tail in ipv6_l4hdr()
  tap: Convert tap4_handler() to iov_tail
  tap: Convert tap6_handler() to iov_tail
  packet: rename packet_data() to packet_get()
  arp: use iov_tail rather than pool
  dhcp: use iov_tail rather than pool
  dhcpv6: use iov_tail rather than pool
  icmp: use iov_tail rather than pool
  ndp: use iov_tail rather than pool
  packet: remove PACKET_POOL() and PACKET_POOL_P()
  packet: remove unused parameter from PACKET_POOL_DECL()
  packet: Refactor vhost-user memory region handling
  packet: Add support for multi-vector packets

 arp.c        |  86 +++++++++++++-------
 arp.h        |   2 +-
 dhcp.c       |  48 ++++++-----
 dhcp.h       |   2 +-
 dhcpv6.c     | 223 +++++++++++++++++++++++++++++++--------------------
 dhcpv6.h     |   2 +-
 icmp.c       |  40 +++++----
 icmp.h       |   2 +-
 iov.c        | 101 ++++++++++++++++++++---
 iov.h        |  58 ++++++++++----
 ip.c         |  33 ++++----
 ip.h         |   3 +-
 ndp.c        |  16 +++-
 ndp.h        |   4 +-
 packet.c     | 146 ++++++++++++++++++---------------
 packet.h     |  45 ++++-------
 pcap.c       |   1 +
 tap.c        | 119 +++++++++++++++------------
 tap.h        |   4 +-
 tcp.c        |  61 +++++++++-----
 tcp_buf.c    |   2 +-
 udp.c        |  33 +++++---
 vhost_user.c |  28 +++----
 virtio.c     |   4 +-
 virtio.h     |  18 ++++-
 vu_common.c  |  48 ++++-------
 26 files changed, 693 insertions(+), 436 deletions(-)

-- 
2.50.1
Don't use the memory of the incoming packet to build the outgoing buffer
as it can be memory of the TX queue in the case of vhost-user.
Moreover, with vhost-user, the packet can be split across several
iovec entries, and it's easier to rebuild it in a buffer than to update
an existing iovec array.
Signed-off-by: Laurent Vivier
These utilities enhance iov_tail manipulation, useful for
efficient packet processing by enabling iovec array cloning and
header stripping without data copies.
- iov_drop_header(): Discards a specified number of bytes from the
beginning of an iov_tail by advancing its internal offset and pruning
consumed elements.
- iov_tail_clone(): Clone an iov_tail into an iovec array, adjusting the
first iovec entry to remove the iov_tail offset.
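To illustrate the two operations described above, here is a minimal sketch, not the code from this patch. The names struct tail_sketch, tail_prune(), tail_drop() and tail_clone() are hypothetical stand-ins, and the layout (iovec pointer, entry count, byte offset into the first entry) is an assumption made for the example.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical stand-in for iov_tail: an iovec array plus a byte offset
 * into its first entry. The field layout is an assumption. */
struct tail_sketch {
	const struct iovec *iov;
	size_t cnt;
	size_t off;
};

/* Skip leading entries that the offset has fully consumed. */
static void tail_prune(struct tail_sketch *t)
{
	while (t->cnt && t->off >= t->iov[0].iov_len) {
		t->off -= t->iov[0].iov_len;
		t->iov++;
		t->cnt--;
	}
}

/* Discard len bytes from the front without copying any data.
 * Returns 1 on success, 0 if fewer than len bytes remain. */
static int tail_drop(struct tail_sketch *t, size_t len)
{
	size_t avail = 0, i;

	for (i = 0; i < t->cnt; i++)
		avail += t->iov[i].iov_len;
	if (avail < t->off || avail - t->off < len)
		return 0;

	t->off += len;
	tail_prune(t);
	return 1;
}

/* Copy the remaining entries into dst and fold the offset into the first
 * one. Returns the number of entries written, or -1 if dst is too small. */
static int tail_clone(struct iovec *dst, size_t dst_cnt, struct tail_sketch *t)
{
	size_t i;

	tail_prune(t);
	if (t->cnt > dst_cnt)
		return -1;

	for (i = 0; i < t->cnt; i++)
		dst[i] = t->iov[i];
	if (t->cnt) {
		dst[0].iov_base = (char *)dst[0].iov_base + t->off;
		dst[0].iov_len -= t->off;
	}
	return (int)t->cnt;
}
```

Only the iovec descriptors are copied by the clone; the payload bytes stay where they are, which is the "without data copies" property the commit message refers to.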
Signed-off-by: Laurent Vivier
Provide a temporary variable of the wanted type to store
the header if the memory in the iovec array is not contiguous.
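The idea can be sketched as follows. This is an illustration under assumptions, not the macro implementation from the patch: peek_sketch() is a hypothetical helper that returns a direct pointer when the requested bytes sit in one iovec entry, and otherwise gathers them into caller-provided storage.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical helper: return a pointer to len bytes located at byte
 * offset off of the iovec array. If the bytes are contiguous, point
 * straight into the iovec; otherwise gather them into storage (which
 * must hold at least len bytes). Returns NULL if the data is too short. */
static const void *peek_sketch(const struct iovec *iov, size_t cnt,
			       size_t off, size_t len, void *storage)
{
	size_t i, copied = 0;

	/* Find the entry that contains the starting offset */
	for (i = 0; i < cnt && off >= iov[i].iov_len; i++)
		off -= iov[i].iov_len;
	if (i == cnt)
		return NULL;

	/* Contiguous case: no copy needed */
	if (iov[i].iov_len - off >= len)
		return (const char *)iov[i].iov_base + off;

	/* Discontiguous case: gather pieces into the temporary storage */
	for (; i < cnt && copied < len; i++) {
		size_t n = iov[i].iov_len - off;

		if (n > len - copied)
			n = len - copied;
		memcpy((char *)storage + copied,
		       (const char *)iov[i].iov_base + off, n);
		copied += n;
		off = 0;
	}

	return copied == len ? storage : NULL;
}
```

A caller would declare a local variable of the header type, pass its address as storage, and use only the returned pointer, so the same code path works for both layouts.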
Signed-off-by: Laurent Vivier
Use IOV_PEEK_HEADER() to get the ethernet header from the iovec.
Move the workaround for multi-entry iovec arrays from vu_handle_tx() to
tap_add_packet(). Removing the offset from the iovec array should
reduce the iovec count to 1.
Signed-off-by: Laurent Vivier
Modify the interface of packet_add_do() to take an iov_tail
rather than a memory pointer and length.
Internally, it only supports an iovec array that has a single entry
after being pruned. An iovec array with several entries is accepted
only if the offset allows the function to reduce the number of
entries to 1.
tap4_handler() is updated to create an iov_tail value using
IOV_TAIL_FROM_BUF() from the buffer and the length.
Signed-off-by: Laurent Vivier
packet_data() gets the data range of a packet descriptor in a
given pool.
It uses iov_tail to return the packet memory.
packet_data() will later be renamed to packet_get(), replacing the
existing function of that name.
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_PEEK_HEADER()
rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
and IOV_PEEK_HEADER() rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
and iov_remove_header_() rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_PEEK_HEADER()
rather than packet_get().
Signed-off-by: Laurent Vivier
No functional change.
Currently, if dhcpv6_opt() is called with offset set to 0, it sets the
offset to point to the start of the DHCPv6 options.
To simplify the use of iov_tail in a later patch, move this initialization
out of the function. Replace all calls that pass 0 with calls that pass
the offset of the DHCPv6 options.
Signed-off-by: Laurent Vivier
Extract code from dhcpv6() into a new function, dhcpv6_send_ia_notonlink().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
and IOV_PEEK_HEADER() rather than packet_get().
Signed-off-by: Laurent Vivier
dhcpv6_opt() and its callers are refactored for iov_tail option parsing,
replacing direct offset management for improved robustness.
Its signature is now `bool dhcpv6_opt(iov_tail *data, type)`. `*data` (in/out)
points to a found option on `true` return or is restored on `false`.
The main dhcpv6() function uses IOV_REMOVE_HEADER for the msg_hdr, then
passes the iov_tail (now at options start) to the new dhcpv6_opt().
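The in/out contract described above can be sketched as follows. This is not the function from the patch: to keep the example short, it walks a contiguous buffer through a hypothetical struct cursor_sketch rather than an iov_tail, and opt_find_sketch() is an illustrative name. It assumes the usual DHCPv6 option layout of a 16-bit code followed by a 16-bit length, both in network byte order.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified view: a contiguous run of DHCPv6 options */
struct cursor_sketch {
	const uint8_t *p;
	size_t len;
};

/* Look for an option of the given type. On success, *data points at the
 * start of that option (its header) and true is returned. On failure,
 * *data is left exactly as the caller passed it. */
static bool opt_find_sketch(struct cursor_sketch *data, uint16_t type)
{
	struct cursor_sketch cur = *data;	/* work on a copy */

	while (cur.len >= 4) {
		uint16_t code = (uint16_t)(cur.p[0] << 8 | cur.p[1]);
		size_t optlen = (size_t)(cur.p[2] << 8 | cur.p[3]);

		if (cur.len - 4 < optlen)
			break;			/* truncated option */

		if (code == type) {
			*data = cur;		/* commit only on success */
			return true;
		}

		cur.p += 4 + optlen;
		cur.len -= 4 + optlen;
	}

	return false;
}
```

Working on a local copy and committing it only when the option is found is one simple way to obtain the "restored on false" behaviour; the patch itself may achieve this differently.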
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
and IOV_PEEK_HEADER() rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
and IOV_PEEK_HEADER() rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_PEEK_HEADER()
rather than packet_get().
Signed-off-by: Laurent Vivier
Use packet_data() and extract headers using IOV_REMOVE_HEADER()
and IOV_PEEK_HEADER() rather than packet_get().
Remove packet_get() as it is not used anymore.
Signed-off-by: Laurent Vivier
As we have removed packet_get(), we can rename packet_data() to packet_get()
as the name is clearer.
Signed-off-by: Laurent Vivier
The arp() function signature is changed to accept `struct iov_tail *data`
directly, replacing the previous `const struct pool *p` parameter.
Consequently, arp() no longer fetches packet data internally using
packet_data(), streamlining its logic.
This simplifies callers like tap4_handler(), which now pass the iov_tail
for the L2 ARP frame directly, removing intermediate pool handling.
Signed-off-by: Laurent Vivier
This patch refactors the dhcp() function to accept `struct iov_tail *data`
directly as its packet input, replacing the previous `const struct pool *p`
parameter. Consequently, dhcp() no longer fetches packet data internally
using packet_data().
This change simplifies callers, such as tap4_handler(), which now pass
the iov_tail representing the L2 frame directly to dhcp(). This removes
the need for intermediate packet pool handling for DHCP processing.
Signed-off-by: Laurent Vivier
This patch refactors the dhcpv6() function to accept `struct iov_tail *data`
directly as its packet input, replacing the `const struct pool *p` parameter.
Consequently, dhcpv6() no longer fetches packet data internally using
packet_data().
This change simplifies callers, such as tap6_handler(), which now pass
the iov_tail representing the L4 UDP segment (DHCPv6 message) directly.
This removes the need for intermediate packet pool handling.
Signed-off-by: Laurent Vivier
This patch refactors the icmp_tap_handler() function to accept
`struct iov_tail *data` directly as its packet input, replacing the
`const struct pool *p` parameter.
This change simplifies callers, such as tap4_handler(), which now pass
the iov_tail representing the L4 ICMP message directly.
This removes the need for intermediate packet pool handling.
Signed-off-by: Laurent Vivier
The ndp() function signature is changed to accept `struct iov_tail *data`
directly, replacing the previous `const struct pool *p` and
`const struct icmp6hdr *ih` parameters.
This change simplifies callers, like tap6_handler(), which now provide
the iov_tail representing the L4 ICMPv6 segment directly to ndp().
Signed-off-by: Laurent Vivier
These macros are no longer used following the refactoring of packet
handlers to directly use iov_tail. Callers no longer require PACKET_POOL_P
for temporary pools, and PACKET_POOL can be replaced by PACKET_POOL_DECL
and separate initialization if needed.
Signed-off-by: Laurent Vivier
_buf is not used in the macro. Remove it.
Remove it also from PACKET_POOL_NOINIT() as it was needed
for PACKET_POOL_DECL().
Signed-off-by: Laurent Vivier
This patch refactors the handling of vhost-user memory regions by
introducing a new `struct vdev_memory` to encapsulate the regions
array and their count (`nregions`) within the main `vu_dev` structure.
This new `vdev_memory` structure is then passed to the packet pool by
re-using the existing `p->buf` field. A `p->buf_size` of 0 indicates
that `p->buf` holds a pointer to `struct vdev_memory` instead of a
regular packet buffer. A new helper, `get_vdev_memory()`, is added to
abstract this access pattern.
The previous implementation used a marker at the end of the memory
regions array. We can now use all the slots.
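As an illustration of the access pattern, here is a minimal sketch. The structure layouts and the _sketch names are assumptions made for the example; only the rule that a buf_size of 0 marks buf as a pointer to the memory description comes from the text above, and the real get_vdev_memory() signature may differ.

```c
#include <stddef.h>

/* Hypothetical, reduced versions of the structures involved */
struct vdev_memory_sketch {
	unsigned int nregions;	/* number of valid memory regions */
};

struct pool_sketch {
	char *buf;		/* packet buffer, or memory description */
	size_t buf_size;	/* 0: buf points to vdev_memory_sketch */
};

/* Return the memory description carried by the pool, or NULL if the
 * pool uses a regular packet buffer. */
static struct vdev_memory_sketch *
get_vdev_memory_sketch(const struct pool_sketch *p)
{
	if (p->buf_size != 0)
		return NULL;

	return (struct vdev_memory_sketch *)p->buf;
}
```

Keeping the check in one helper means callers never have to interpret buf_size themselves.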
Signed-off-by: Laurent Vivier
The packet pool was previously limited to handling packets contained
within a single buffer.
This patch extends the packet pool to support iovec array,
allowing a single logical packet to be composed of multiple iovec.
To accommodate this, the storage format within the pool is modified.
For a multi-vector packet, a header entry is now stored first with
iov_base = NULL and iov_len holding the number of subsequent
vectors. The actual data vectors are then stored in the following
pool slots.
The packet_add_do() and packet_get_do() functions are updated to
manage this new format for storing and retrieving packets. The
pool_full() check is also adjusted to ensure there is enough
space for all vectors of a new packet before adding it.
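The storage format can be sketched as follows. This is an illustration under assumptions, not the code from this patch: struct mv_pool, pool_add_sketch() and pool_get_sketch() are hypothetical names, packets are addressed by slot index for simplicity, and a single-vector packet is assumed to occupy one slot with no header entry.

```c
#include <stddef.h>
#include <sys/uio.h>

#define POOL_SLOTS 8

/* Hypothetical pool: a flat array of iovec slots */
struct mv_pool {
	size_t count;			/* slots in use */
	struct iovec pkt[POOL_SLOTS];
};

/* Store one packet. A multi-vector packet is preceded by a header slot
 * with iov_base == NULL and iov_len == number of vectors that follow.
 * Returns 0 on success, -1 if the packet is invalid or does not fit. */
static int pool_add_sketch(struct mv_pool *p, const struct iovec *iov,
			   size_t cnt)
{
	size_t needed = cnt > 1 ? cnt + 1 : cnt;
	size_t i;

	if (cnt == 0 || needed > POOL_SLOTS - p->count)
		return -1;

	/* A NULL base is reserved for header slots */
	for (i = 0; i < cnt; i++)
		if (!iov[i].iov_base)
			return -1;

	if (cnt > 1) {
		p->pkt[p->count].iov_base = NULL;
		p->pkt[p->count].iov_len = cnt;
		p->count++;
	}
	for (i = 0; i < cnt; i++)
		p->pkt[p->count++] = iov[i];

	return 0;
}

/* Return the data vectors of the packet starting at slot idx and store
 * their number in *cnt, or return NULL if idx is out of range. */
static const struct iovec *pool_get_sketch(const struct mv_pool *p,
					   size_t idx, size_t *cnt)
{
	if (idx >= p->count)
		return NULL;

	if (!p->pkt[idx].iov_base) {		/* header slot */
		*cnt = p->pkt[idx].iov_len;
		return &p->pkt[idx + 1];
	}

	*cnt = 1;
	return &p->pkt[idx];
}
```

The space check counts the header slot as well, which corresponds to the requirement that there be room for all vectors of a new packet before it is added.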
Signed-off-by: Laurent Vivier
On Tue, Sep 02, 2025 at 09:52:53AM +0200, Laurent Vivier wrote:
> The packet pool was previously limited to handling packets contained
> within a single buffer.
>
> This patch extends the packet pool to support iovec array,
> allowing a single logical packet to be composed of multiple iovec.
>
> To accommodate this, the storage format within the pool is modified.
> For a multi-vector packet, a header entry is now stored first with
> iov_base = NULL and iov_len holding the number of subsequent
> vectors. The actual data vectors are then stored in the following
> pool slots.
>
> The packet_add_do() and packet_get_do() functions are updated to
> manage this new format for storing and retrieving packets. The
> pool_full() check is also adjusted to ensure there is enough
> space for all vectors of a new packet before adding it.
>
> Signed-off-by: Laurent Vivier
Reviewed-by: David Gibson
Participants (3): David Gibson, Laurent Vivier, Stefano Brivio