Although we have an abstraction for the "slow path" (DHCP, NDP) guest
bound packets, the TCP and UDP forwarding paths write directly to the
tap fd. However, the ways they send frames to the tap device turn out
to be more similar than they first appear.
This series unifies the low-level tap send functions for TCP and UDP,
and makes some cleanups along the way.
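To give a rough idea of what a shared low-level send function has to deal with, here is a minimal sketch. It is not the code from this series: send_frames_stream() is a hypothetical name, and it assumes the tap fd is a blocking stream socket with one frame per iovec entry. It sends a batch with a single sendmsg() and, if the kernel stopped part way through a frame, finishes that frame so the receiver's framing stays in sync.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper (assumption, not the function added by this series):
 * send @n frames, one per iovec entry, on a blocking stream socket.
 * Returns the number of frames completely sent, or -1 on error.
 */
static int send_frames_stream(int fd, struct iovec *iov, size_t n)
{
	struct msghdr mh = { .msg_iov = iov, .msg_iovlen = n };
	ssize_t sent;
	size_t i;

	assert(iov || !n);

	sent = sendmsg(fd, &mh, MSG_NOSIGNAL);
	if (sent < 0)
		return -1;

	/* Skip over the frames that went out completely */
	for (i = 0; i < n && (size_t)sent >= iov[i].iov_len; i++)
		sent -= iov[i].iov_len;

	/* The send stopped in the middle of frame i: send the remainder */
	if (i < n && sent > 0) {
		char *p = (char *)iov[i].iov_base + sent;
		size_t left = iov[i].iov_len - sent;

		while (left) {
			ssize_t rc = send(fd, p, left, MSG_NOSIGNAL);

			if (rc < 0)
				return -1;
			p += rc;
			left -= rc;
		}
		i++;
	}

	return (int)i;
}
```

A caller would retry or drop the frames from the returned index onwards. A tuntap device would instead need one write per frame, since a partial write there has no sensible meaning.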
This is based on my earlier outstanding series.
Changes since v1:
* Abstract tap header construction as well as frame send (a number of
new patches)
* Remove the unneeded flags and buf_bytes globals as well
* Fix a bug where we weren't correctly setting iov_len after the move
to passing variable-sized iovecs to send_frames().
David Gibson (18):
pcap: Introduce pcap_frame() helper
pcap: Replace pcapm() with pcap_multiple()
tcp: Combine two parts of passt tap send path together
tcp: Don't keep compute total bytes in a message until we need it
tcp: Improve interface to tcp_l2_buf_flush()
tcp: Combine two parts of pasta tap send path together
tap, tcp: Move tap send path to tap.c
util: Introduce hton*_constant() in place of #ifdefs
tcp, udp: Use named field initializers in iov_init functions
util: Parameterize ethernet header initializer macro
tcp: Remove redundant and incorrect initialization from *_iov_init()
tcp: Consolidate calculation of total frame size
tap: Add "tap headers" abstraction
tcp: Use abstracted tap header
tap: Use different io vector bases depending on tap type
udp: Use abstracted tap header
udp: Use tap_send_frames()
tap: Improve handling of partial frame sends
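For readers unfamiliar with the idea behind the hton*_constant() patch in the list above, the following is a sketch of the general technique only, not the macros from the patch. HTONS_CONST is a hypothetical name, and the sketch assumes a GCC or Clang compatible compiler that predefines __BYTE_ORDER__. The point is to get a network byte order value that is a compile-time constant, so it can be used in static initializers where a call to htons() cannot.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro (assumption): compile-time equivalent of htons(),
 * selected once here instead of with #ifdefs at every point of use.
 */
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define HTONS_CONST(x)	((uint16_t)(x))
#else
#define HTONS_CONST(x)							\
	((uint16_t)((((x) & 0xffU) << 8) | (((x) >> 8) & 0xffU)))
#endif

/* Usable in a static initializer, unlike a function call */
static const uint16_t ethertype_ip4 = HTONS_CONST(0x0800);

/* Returns non-zero if the constant agrees with the runtime htons() */
static int htons_const_matches(void)
{
	return ethertype_ip4 == htons(0x0800);
}
```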
dhcpv6.c | 50 +++--------
pcap.c | 78 +++++------------
pcap.h | 3 +-
tap.c | 123 +++++++++++++++++++++++++++
tap.h | 54 ++++++++++++
tcp.c | 254 +++++++++++++------------------------------------------
udp.c | 213 ++++++----------------------------------------
udp.h | 2 +-
util.h | 47 ++--------
9 files changed, 303 insertions(+), 521 deletions(-)
--
2.38.1
At present, the UDP "splice" and "tap" paths are quite separate. We
have separate sockets to receive packets bound for the tap and splice
paths. This leads to some code duplication, and extra open sockets.
This series partially unifies the two paths, allowing us to use a
single (host side) socket, bound to 0.0.0.0 or :: to receive packets
for both cases.
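With a single socket, the choice between the two paths has to be made for each datagram from its source address. The sketch below shows one way such a check could look. It is an illustration only: src_is_loopback() is a hypothetical name, and the criterion used (the source address is a loopback address) is an assumption, not taken from the patches. Note that it checks the address family before looking at any address bytes, so an IPv6 source is never interpreted as IPv4.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical predicate (assumption): true if @sa is a loopback source */
static bool src_is_loopback(const struct sockaddr *sa)
{
	assert(sa != NULL);

	if (sa->sa_family == AF_INET) {
		const struct sockaddr_in *sin =
			(const struct sockaddr_in *)sa;

		/* Anything in 127.0.0.0/8 */
		return (ntohl(sin->sin_addr.s_addr) >> 24) == 127;
	}

	if (sa->sa_family == AF_INET6) {
		const struct sockaddr_in6 *sin6 =
			(const struct sockaddr_in6 *)sa;

		/* Only ::1 */
		return IN6_IS_ADDR_LOOPBACK(&sin6->sin6_addr);
	}

	return false;
}
```

In a receive loop this would be called once per entry filled in by recvmmsg(), using the msg_name of each message.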
This is based on my earlier series with some fixes for the tap path.
David Gibson (8):
udp: Move sending pasta tap frames to the end of udp_sock_handler()
udp: Split sending to passt tap interface into separate function
udp: Split receive from preparation and send in udp_sock_handler()
udp: Receive multiple datagrams at once on the pasta sock->tap path
udp: Pre-populate msg_names with local address
udp: Unify udp_sock_handler_splice() with udp_sock_handler()
udp: Decide whether to "splice" per datagram rather than per socket
udp: Don't use separate sockets to listen for spliced packets
udp.c | 382 ++++++++++++++++++++++++++++++---------------------------
udp.h | 2 +-
util.h | 7 ++
3 files changed, 207 insertions(+), 184 deletions(-)
--
2.38.1
At present, the UDP "splice" and "tap" paths are quite separate. We
have separate sockets to receive packets bound for the tap and splice
paths. This leads to some code duplication, and extra open sockets.
This series partially unifies the two paths, allowing us to use a
single (host side) socket, bound to 0.0.0.0 or :: to receive packets
for both cases.
This is based on my earlier series with some fixes for the tap path.
Changes since v1:
* Renamed udp_localname[46] to udp[46]_localname
* Allow handling of UDP port 0
* Fix a bug which could misidentify certain v6 packets as v4-spliceable
* Some minor cosmetic fixes to code and commit messages
David Gibson (8):
udp: Move sending pasta tap frames to the end of udp_sock_handler()
udp: Split sending to passt tap interface into separate function
udp: Split receive from preparation and send in udp_sock_handler()
udp: Receive multiple datagrams at once on the pasta sock->tap path
udp: Pre-populate msg_names with local address
udp: Unify udp_sock_handler_splice() with udp_sock_handler()
udp: Decide whether to "splice" per datagram rather than per socket
udp: Don't use separate sockets to listen for spliced packets
udp.c | 380 ++++++++++++++++++++++++++++++---------------------------
udp.h | 2 +-
util.h | 7 ++
3 files changed, 205 insertions(+), 184 deletions(-)
--
2.38.1
Although we have an abstraction for the "slow path" (DHCP, NDP) guest
bound packets, the TCP and UDP forwarding paths write directly to the
tap fd. However, the ways they send frames to the tap device turn out
to be more similar than they first appear.
This series unifies the low-level tap send functions for TCP and UDP,
and makes some cleanups along the way.
David Gibson (10):
pcap: Introduce pcap_frame() helper
pcap: Replace pcapm() with pcap_multiple()
tcp: Combine two parts of passt tap send path together
tcp: Don't keep compute total bytes in a message until we need it
tcp: Improve interface to tcp_l2_buf_flush()
tcp: Combine two parts of pasta tap send path together
tap, tcp: Move tap send path to tap.c
tcp,tap: Use different io vector bases depending on tap type
udp: Use tap_send_frames()
tap: Improve handling of partial frame sends
pcap.c | 78 ++++++++-----------------------
pcap.h | 3 +-
tap.c | 108 ++++++++++++++++++++++++++++++++++++++++++
tap.h | 1 +
tcp.c | 145 +++++++++++++--------------------------------------------
udp.c | 145 +++------------------------------------------------------
udp.h | 2 +-
7 files changed, 169 insertions(+), 313 deletions(-)
--
2.38.1
It turns out a couple of places on the IPv4 specific inbound path
accidentally use control structures that are supposed to be for IPv6.
That could lead to weird behaviour in a rather complex set of
circumstances.
Patch 1/4 here is the actual fix; the rest make some cleanups to the
code that should make similar mistakes harder to commit in future.
This is based on my earlier cleanup of the UDP splicing code, although
I think it will rebase trivially.
David Gibson (4):
udp: Fix inorrect use of IPv6 mh buffers in IPv4 path
udp: Better factor IPv4 and IPv6 paths in udp_sock_handler()
udp: Preadjust udp[46]_l2_iov_tap[].iov_base for pasta mode
udp: Factor out control structure management from
udp_sock_fill_data_v[46]
udp.c | 184 ++++++++++++++++++++++++++--------------------------------
1 file changed, 81 insertions(+), 103 deletions(-)
--
2.38.1
The UDP "splicing" (forwarding packets from one L4 socket to another,
rather than via the tuntap device) code assumes that any given UDP
port in the init namespace will only communicate with a single port on
the ns side at a time, and vice versa. This will often be the case,
but since UDP is a connectionless protocol, it need not be. In fact
it is not the case in our existing UDP bandwidth checks, although the
specific configuration there means it's not harmful in that case.
The failure mode in this case can be quite bad: we don't just fall
back to an unoptimized path, or drop packets, we will misdirect
packets to the wrong destination.
This series makes some substantial simplifications to how we handle the
splice forwarding, then corrects it to handle the case of multiple
source ports sending to a single destination.
This does come at a performance cost. It's not as large as I feared,
and shouldn't affect the most common case where there is a 1 to 1
mapping between source and destination ports. I haven't yet been able
to confirm the latter because the iperf3 bandwidth test we use *does*
have interleaved streams with a common destination port.
Based on the earlier series for dual stack TCP sockets.
Changes since v3:
* Changed interface of udp_splice_sendfrom() to slightly better
separate concerns and to make some future cleanups simpler
* Fixed a serious buffer overrun bug where we weren't bounds checking
as we scanned for additional datagrams with the same source
address.
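The kind of scan involved in that second fix can be sketched as follows. This is not the code from the series: sa_port() and same_src_run() are hypothetical names, and for simplicity the sketch compares source ports only rather than full addresses. Given the source addresses of a batch of received datagrams, it finds how many consecutive datagrams starting at a given index share a source, so they can be forwarded together. The `i < n` condition is the bounds check.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper (assumption): port in host byte order for an IPv4
 * or IPv6 address, 0 for any other family.
 */
static in_port_t sa_port(const struct sockaddr_storage *ss)
{
	if (ss->ss_family == AF_INET)
		return ntohs(((const struct sockaddr_in *)ss)->sin_port);
	if (ss->ss_family == AF_INET6)
		return ntohs(((const struct sockaddr_in6 *)ss)->sin6_port);
	return 0;
}

/* Hypothetical helper (assumption): number of consecutive datagrams,
 * starting at index @start of the @n received, with the same source port.
 */
static size_t same_src_run(const struct sockaddr_storage *src, size_t n,
			   size_t start)
{
	size_t i;

	assert(start < n);

	/* Never look past the last datagram actually received */
	for (i = start + 1; i < n; i++) {
		if (sa_port(&src[i]) != sa_port(&src[start]))
			break;
	}

	return i - start;
}
```

The caller would forward each run in one batch, then continue from start plus the run length until all n datagrams are handled.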
Changes since v2:
* Minor style and comment revisions
Changes since v1:
* Added patches 12..16/16 fixing the delivery of packets, as well as
just simplifying the mechanics
David Gibson (16):
udp: Also bind() connected ports for "splice" forwarding
udp: Separate tracking of inbound and outbound packet flows
udp: Always use sendto() rather than send() for forwarding spliced
packets
udp: Don't connect "forward" sockets for spliced flows
udp: Remove the @bound field from union udp_epoll_ref
udp: Split splice field in udp_epoll_ref into (mostly) independent
bits
udp: Don't create double sockets for -U port
udp: Re-use fixed bound sockets for packet forwarding when possible
udp: Don't explicitly track originating socket for spliced
"connections"
udp: Update UDP "connection" timestamps in both directions
udp: Simplify udp_sock_handler_splice
udp: Make UDP_SPLICE_FRAMES and UDP_TAP_FRAMES_MEM the same thing
udp: Add helper to extract port from a sockaddr_in or sockaddr_in6
udp: Unify buffers for tap and splice paths
udp: Split send half of udp_sock_handler_splice() from the receive
half
udp: Correct splice forwarding when receiving from multiple sources
passt.h | 2 +
udp.c | 522 ++++++++++++++++++++++++++------------------------------
udp.h | 16 +-
3 files changed, 248 insertions(+), 292 deletions(-)
--
2.38.1
The UDP "splicing" (forwarding packets from one L4 socket to another,
rather than via the tuntap device) code assumes that any given UDP
port in the init namespace will only communicate with a single port on
the ns side at a time, and vice versa. This will often be the case,
but since UDP is a connectionless protocol, it need not be. In fact
it is not the case in our existing UDP bandwidth checks, although the
specific configuration there means it's not harmful in that case.
The failure mode in this case can be quite bad: we don't just fall
back to an unoptimized path, or drop packets, we will misdirect
packets to the wrong destination.
This series makes some substantial simplifications to how we handle the
splice forwarding, then corrects it to handle the case of multiple
source ports sending to a single destination.
This does come at a performance cost. It's not as large as I feared,
and shouldn't affect the most common case where there is a 1 to 1
mapping between source and destination ports. I haven't yet been able
to confirm the latter because the iperf3 bandwidth test we use *does*
have interleaved streams with a common destination port.
Based on the earlier series for dual stack TCP sockets.
Changes since v1:
* Added patches 12..16/16 fixing the delivery of packets, as well as
just simplifying the mechanics
David Gibson (16):
udp: Also bind() connected ports for "splice" forwarding
udp: Separate tracking of inbound and outbound packet flows
udp: Always use sendto() rather than send() for forwarding spliced
packets
udp: Don't connect "forward" sockets for spliced flows
udp: Remove the @bound field from union udp_epoll_ref
udp: Split splice field in udp_epoll_ref into (mostly) independent
bits
udp: Don't create double sockets for -U port
udp: Re-use fixed bound sockets for packet forwarding when possible
udp: Don't explicitly track originating socket for spliced
"connections"
udp: Update UDP "connection" timestamps in both directions
udp: Simplify udp_sock_handler_splice
udp: Make UDP_SPLICE_FRAMES and UDP_TAP_FRAMES_MEM the same thing
udp: Add helper to extract port from a sockaddr_in or sockaddr_in6
udp: Unify buffers for tap and splice paths
udp: Split send half of udp_sock_handler_splice() from the receive
half
udp: Correct splice forwarding when receiving from multiple sources
passt.h | 2 +
udp.c | 518 +++++++++++++++++++++++++-------------------------------
udp.h | 16 +-
3 files changed, 244 insertions(+), 292 deletions(-)
--
2.38.1