The main packet "fast paths" for UDP and TCP mostly just forward packets rather than generating them from scratch. However the control paths for ICMP and DHCP sometimes generate packets more or less from scratch. Because these are relatively rare, it's not performance critical. The paths for sending these packets have some duplication of the header generation. There's also some layering violation in tap_ip_send() which both generates IP headers and updates the L4 (UDP or UCMP) checksum. Finally that checksum generation is a little awkward: it temporarily generates the IP pseudo header (or something close enough to serve) in the place of the actual header, generates the checksum, then replaces it with the real IP header. This approach seems to be causing miscompiles with some LTO optimization, because the stores to the pseudo header are being moved or elided across the code calculating the checksum. This series addresses all of these. We consolidate and clarify the packet sending helpers, and use them in some places there was previously duplicated code. In the process we use new checksum generation helpers which take a different approach which should avoid the LTO problems (this aspect I haven't tested yet though). David Gibson (14): Add csum_icmp6() helper for calculating ICMPv6 checksums Add csum_icmp4() helper for calculating ICMPv4 checksums Add csum_udp6() helper for calculating UDP over IPv6 checksums Add csum_udp4() helper for calculating UDP over IPv4 checksums Add csum_ip4_header() helper to calculate IPv4 header checksums Add helpers for normal inbound packet destination addresses Remove support for TCP packets from tap_ip_send() tap: Remove unhelpeful vnet_pre optimization from tap_send() Split tap_ip_send() into IPv4 and IPv6 specific functions tap: Split tap_ip6_send() into UDP and ICMP variants ndp: Remove unneeded eh_source parameter ndp: Use tap_icmp6_send() helper tap: Split tap_ip4_send() into UDP and ICMP variants dhcp: Use tap_udp4_send() helper in dhcp() arp.c | 2 +- checksum.c | 122 ++++++++++++++++++----- checksum.h | 19 +++- dhcp.c | 19 +--- dhcpv6.c | 21 +--- icmp.c | 12 +-- ndp.c | 28 +----- ndp.h | 3 +- tap.c | 286 ++++++++++++++++++++++++++++++++--------------------- tap.h | 19 +++- 10 files changed, 323 insertions(+), 208 deletions(-) -- 2.37.3
At least two places in passt calculate ICMPv6 checksums, ndp() and tap_ip_send(). Add a helper to handle this calculation in both places. For future flexibility, the new helper takes parameters for the fields in the IPv6 pseudo-header, so an IPv6 header or pseudo-header doesn't need to be explicitly constructed. It also allows the ICMPv6 header and payload to be in separate buffers, although we don't use this yet. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 27 +++++++++++++++++++++++++++ checksum.h | 7 +++++++ ndp.c | 5 +---- tap.c | 6 ++---- 4 files changed, 37 insertions(+), 8 deletions(-) diff --git a/checksum.c b/checksum.c index 56ad01e..0e207c8 100644 --- a/checksum.c +++ b/checksum.c @@ -52,6 +52,8 @@ #include <stddef.h> #include <stdint.h> +#include <linux/icmpv6.h> + /** * sum_16b() - Calculate sum of 16-bit words * @buf: Input buffer @@ -105,6 +107,31 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) return (uint16_t)~csum_fold(sum_16b(buf, len) + init); } +/** + * csum_icmp6() - Calculate checksum for an ICMPv6 packet + * @icmp6hr: ICMPv6 header, initialized apart from checksum + * @saddr: IPv6 source address + * @daddr: IPv6 destination address + * @payload: ICMP packet payload + * @len: Length of @payload (not including ICMPv6 header) + */ +void csum_icmp6(struct icmp6hdr *icmp6hr, + const struct in6_addr *saddr, + const struct in6_addr *daddr, + const void *payload, + size_t len) +{ + /* Partial checksum for the pseudo-IPv6 header */ + uint32_t psum = sum_16b(saddr, sizeof(*saddr)) + + sum_16b(daddr, sizeof(*daddr)) + + htons(len + sizeof(*icmp6hr)) + htons(IPPROTO_ICMPV6); + + icmp6hr->icmp6_cksum = 0; + /* Add in partial checksum for the ICMPv6 header alone */ + psum += sum_16b(icmp6hr, sizeof(*icmp6hr)); + icmp6hr->icmp6_cksum = csum_unaligned(payload, len, psum); +} + /** * csum_tcp4() - Calculate TCP checksum for IPv4 and set in place * @iph: Packet buffer, IP header diff --git a/checksum.h b/checksum.h index 5418406..2c72200 100644 --- a/checksum.h +++ b/checksum.h @@ -6,9 +6,16 @@ #ifndef CHECKSUM_H #define CHECKSUM_H +struct icmp6hdr; + uint32_t sum_16b(const void *buf, size_t len); uint16_t csum_fold(uint32_t sum); uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init); +void csum_icmp6(struct icmp6hdr *ih, + const struct in6_addr *saddr, + const struct in6_addr *daddr, + const void *payload, + size_t len); void csum_tcp4(struct iphdr *iph); uint16_t csum(const void *buf, size_t len, uint32_t init); diff --git a/ndp.c b/ndp.c index dec36a9..03f1d06 100644 --- a/ndp.c +++ b/ndp.c @@ -189,10 +189,7 @@ dns_done: ip6hr->saddr = c->ip6.addr_ll; ip6hr->payload_len = htons(sizeof(*ihr) + len); - ip6hr->hop_limit = IPPROTO_ICMPV6; - ihr->icmp6_cksum = 0; - ihr->icmp6_cksum = csum_unaligned(ip6hr, sizeof(*ip6hr) + - sizeof(*ihr) + len, 0); + csum_icmp6(ihr, &ip6hr->saddr, &ip6hr->daddr, ihr + 1, len); ip6hr->version = 6; ip6hr->nexthdr = IPPROTO_ICMPV6; diff --git a/tap.c b/tap.c index 8b6d9bc..aafc92b 100644 --- a/tap.c +++ b/tap.c @@ -191,10 +191,8 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, } else if (proto == IPPROTO_ICMPV6) { struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1); - ih->icmp6_cksum = 0; - ih->icmp6_cksum = csum_unaligned(ip6h, - len + sizeof(*ip6h), - 0); + csum_icmp6(ih, &ip6h->saddr, &ip6h->daddr, + ih + 1, len - sizeof(*ih)); } ip6h->version = 6; ip6h->nexthdr = proto; -- 2.37.3
On Mon, 17 Oct 2022 19:57:54 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:At least two places in passt calculate ICMPv6 checksums, ndp() and tap_ip_send(). Add a helper to handle this calculation in both places. For future flexibility, the new helper takes parameters for the fields in the IPv6 pseudo-header, so an IPv6 header or pseudo-header doesn't need to be explicitly constructed. It also allows the ICMPv6 header and payload to be in separate buffers, although we don't use this yet. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 27 +++++++++++++++++++++++++++ checksum.h | 7 +++++++ ndp.c | 5 +---- tap.c | 6 ++---- 4 files changed, 37 insertions(+), 8 deletions(-) diff --git a/checksum.c b/checksum.c index 56ad01e..0e207c8 100644 --- a/checksum.c +++ b/checksum.c @@ -52,6 +52,8 @@ #include <stddef.h> #include <stdint.h> +#include <linux/icmpv6.h> + /** * sum_16b() - Calculate sum of 16-bit words * @buf: Input buffer @@ -105,6 +107,31 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) return (uint16_t)~csum_fold(sum_16b(buf, len) + init); } +/** + * csum_icmp6() - Calculate checksum for an ICMPv6 packet"Calculate and set" ...?+ * @icmp6hr: ICMPv6 header, initialized apart from checksum + * @saddr: IPv6 source address + * @daddr: IPv6 destination address + * @payload: ICMP packet payload + * @len: Length of @payload (not including ICMPv6 header) + */ +void csum_icmp6(struct icmp6hdr *icmp6hr, + const struct in6_addr *saddr, + const struct in6_addr *daddr,I think: const struct in6_addr *saddr, const struct in6_addr *daddr, would be easier on eyes.+ const void *payload, + size_t len) +{ + /* Partial checksum for the pseudo-IPv6 header */ + uint32_t psum = sum_16b(saddr, sizeof(*saddr)) + + sum_16b(daddr, sizeof(*daddr)) + + htons(len + sizeof(*icmp6hr)) + htons(IPPROTO_ICMPV6);Maybe: uint32_t psum = sum_16b(saddr, sizeof(*saddr)) + sum_16b(daddr, sizeof(*daddr)) + htons(len + sizeof(*icmp6hr)) + htons(IPPROTO_ICMPV6); for me, it turns things from "sum a bunch of things" to "addresses and something else".+ + icmp6hr->icmp6_cksum = 0; + /* Add in partial checksum for the ICMPv6 header alone */ + psum += sum_16b(icmp6hr, sizeof(*icmp6hr)); + icmp6hr->icmp6_cksum = csum_unaligned(payload, len, psum); +} + /** * csum_tcp4() - Calculate TCP checksum for IPv4 and set in place * @iph: Packet buffer, IP header diff --git a/checksum.h b/checksum.h index 5418406..2c72200 100644 --- a/checksum.h +++ b/checksum.h @@ -6,9 +6,16 @@ #ifndef CHECKSUM_H #define CHECKSUM_H +struct icmp6hdr; + uint32_t sum_16b(const void *buf, size_t len); uint16_t csum_fold(uint32_t sum); uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init); +void csum_icmp6(struct icmp6hdr *ih, + const struct in6_addr *saddr, + const struct in6_addr *daddr, + const void *payload, + size_t len);It looks a bit like Haskell. ;) I would really use the horizontal space we have.void csum_tcp4(struct iphdr *iph); uint16_t csum(const void *buf, size_t len, uint32_t init); diff --git a/ndp.c b/ndp.c index dec36a9..03f1d06 100644 --- a/ndp.c +++ b/ndp.c @@ -189,10 +189,7 @@ dns_done: ip6hr->saddr = c->ip6.addr_ll; ip6hr->payload_len = htons(sizeof(*ihr) + len); - ip6hr->hop_limit = IPPROTO_ICMPV6; - ihr->icmp6_cksum = 0; - ihr->icmp6_cksum = csum_unaligned(ip6hr, sizeof(*ip6hr) + - sizeof(*ihr) + len, 0); + csum_icmp6(ihr, &ip6hr->saddr, &ip6hr->daddr, ihr + 1, len);Nice to see this all going away!ip6hr->version = 6; ip6hr->nexthdr = IPPROTO_ICMPV6; diff --git a/tap.c b/tap.c index 8b6d9bc..aafc92b 100644 --- a/tap.c +++ b/tap.c @@ -191,10 +191,8 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, } else if (proto == IPPROTO_ICMPV6) { struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1); - ih->icmp6_cksum = 0; - ih->icmp6_cksum = csum_unaligned(ip6h, - len + sizeof(*ip6h), - 0); + csum_icmp6(ih, &ip6h->saddr, &ip6h->daddr, + ih + 1, len - sizeof(*ih)); } ip6h->version = 6; ip6h->nexthdr = proto;-- Stefano
On Tue, Oct 18, 2022 at 05:01:01AM +0200, Stefano Brivio wrote:On Mon, 17 Oct 2022 19:57:54 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Done.At least two places in passt calculate ICMPv6 checksums, ndp() and tap_ip_send(). Add a helper to handle this calculation in both places. For future flexibility, the new helper takes parameters for the fields in the IPv6 pseudo-header, so an IPv6 header or pseudo-header doesn't need to be explicitly constructed. It also allows the ICMPv6 header and payload to be in separate buffers, although we don't use this yet. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 27 +++++++++++++++++++++++++++ checksum.h | 7 +++++++ ndp.c | 5 +---- tap.c | 6 ++---- 4 files changed, 37 insertions(+), 8 deletions(-) diff --git a/checksum.c b/checksum.c index 56ad01e..0e207c8 100644 --- a/checksum.c +++ b/checksum.c @@ -52,6 +52,8 @@ #include <stddef.h> #include <stdint.h> +#include <linux/icmpv6.h> + /** * sum_16b() - Calculate sum of 16-bit words * @buf: Input buffer @@ -105,6 +107,31 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) return (uint16_t)~csum_fold(sum_16b(buf, len) + init); } +/** + * csum_icmp6() - Calculate checksum for an ICMPv6 packet"Calculate and set" ...?Done. Not sure why I did it that way in the first place.+ * @icmp6hr: ICMPv6 header, initialized apart from checksum + * @saddr: IPv6 source address + * @daddr: IPv6 destination address + * @payload: ICMP packet payload + * @len: Length of @payload (not including ICMPv6 header) + */ +void csum_icmp6(struct icmp6hdr *icmp6hr, + const struct in6_addr *saddr, + const struct in6_addr *daddr,I think: const struct in6_addr *saddr, const struct in6_addr *daddr, would be easier on eyes.Fair enough, done.+ const void *payload, + size_t len) +{ + /* Partial checksum for the pseudo-IPv6 header */ + uint32_t psum = sum_16b(saddr, sizeof(*saddr)) + + sum_16b(daddr, sizeof(*daddr)) + + htons(len + sizeof(*icmp6hr)) + htons(IPPROTO_ICMPV6);Maybe: uint32_t psum = sum_16b(saddr, sizeof(*saddr)) + sum_16b(daddr, sizeof(*daddr)) + htons(len + sizeof(*icmp6hr)) + htons(IPPROTO_ICMPV6); for me, it turns things from "sum a bunch of things" to "addresses and something else".-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson+ + icmp6hr->icmp6_cksum = 0; + /* Add in partial checksum for the ICMPv6 header alone */ + psum += sum_16b(icmp6hr, sizeof(*icmp6hr)); + icmp6hr->icmp6_cksum = csum_unaligned(payload, len, psum); +} + /** * csum_tcp4() - Calculate TCP checksum for IPv4 and set in place * @iph: Packet buffer, IP header diff --git a/checksum.h b/checksum.h index 5418406..2c72200 100644 --- a/checksum.h +++ b/checksum.h @@ -6,9 +6,16 @@ #ifndef CHECKSUM_H #define CHECKSUM_H +struct icmp6hdr; + uint32_t sum_16b(const void *buf, size_t len); uint16_t csum_fold(uint32_t sum); uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init); +void csum_icmp6(struct icmp6hdr *ih, + const struct in6_addr *saddr, + const struct in6_addr *daddr, + const void *payload, + size_t len);It looks a bit like Haskell. ;) I would really use the horizontal space we have.void csum_tcp4(struct iphdr *iph); uint16_t csum(const void *buf, size_t len, uint32_t init); diff --git a/ndp.c b/ndp.c index dec36a9..03f1d06 100644 --- a/ndp.c +++ b/ndp.c @@ -189,10 +189,7 @@ dns_done: ip6hr->saddr = c->ip6.addr_ll; ip6hr->payload_len = htons(sizeof(*ihr) + len); - ip6hr->hop_limit = IPPROTO_ICMPV6; - ihr->icmp6_cksum = 0; - ihr->icmp6_cksum = csum_unaligned(ip6hr, sizeof(*ip6hr) + - sizeof(*ihr) + len, 0); + csum_icmp6(ihr, &ip6hr->saddr, &ip6hr->daddr, ihr + 1, len);Nice to see this all going away!ip6hr->version = 6; ip6hr->nexthdr = IPPROTO_ICMPV6; diff --git a/tap.c b/tap.c index 8b6d9bc..aafc92b 100644 --- a/tap.c +++ b/tap.c @@ -191,10 +191,8 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, } else if (proto == IPPROTO_ICMPV6) { struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1); - ih->icmp6_cksum = 0; - ih->icmp6_cksum = csum_unaligned(ip6h, - len + sizeof(*ip6h), - 0); + csum_icmp6(ih, &ip6h->saddr, &ip6h->daddr, + ih + 1, len - sizeof(*ih)); } ip6h->version = 6; ip6h->nexthdr = proto;
Although tap_ip_send() is currently the only place calculating ICMPv4 checksums, create a helper function for symmetry with ICMPv6. For future flexibility it allows the ICMPv6 header and payload to be in separate buffers. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 15 +++++++++++++++ checksum.h | 2 ++ tap.c | 4 +--- 3 files changed, 18 insertions(+), 3 deletions(-) diff --git a/checksum.c b/checksum.c index 0e207c8..c8b6b42 100644 --- a/checksum.c +++ b/checksum.c @@ -52,6 +52,7 @@ #include <stddef.h> #include <stdint.h> +#include <linux/icmp.h> #include <linux/icmpv6.h> /** @@ -107,6 +108,20 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) return (uint16_t)~csum_fold(sum_16b(buf, len) + init); } +/** + * csum_icmp4() - Calculate checksum for an ICMPv4 packet + * @icmp4hr: ICMPv4 header, initialized apart from checksum + * @payload: ICMPv4 packet payload + * @len: Length of @payload (not including ICMPv4 header) + */ +void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len) +{ + /* Partial checksum for ICMPv4 header alone */ + uint32_t hrsum = sum_16b(icmp4hr, sizeof(*icmp4hr)); + icmp4hr->checksum = 0; + icmp4hr->checksum = csum_unaligned(payload, len, hrsum); +} + /** * csum_icmp6() - Calculate checksum for an ICMPv6 packet * @icmp6hr: ICMPv6 header, initialized apart from checksum diff --git a/checksum.h b/checksum.h index 2c72200..ff95cf9 100644 --- a/checksum.h +++ b/checksum.h @@ -6,11 +6,13 @@ #ifndef CHECKSUM_H #define CHECKSUM_H +struct icmphdr; struct icmp6hdr; uint32_t sum_16b(const void *buf, size_t len); uint16_t csum_fold(uint32_t sum); uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init); +void csum_icmp4(struct icmphdr *ih, const void *payload, size_t len); void csum_icmp6(struct icmp6hdr *ih, const struct in6_addr *saddr, const struct in6_addr *daddr, diff --git a/tap.c b/tap.c index aafc92b..f082901 100644 --- a/tap.c +++ b/tap.c @@ -148,9 +148,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, uh->check = 0; } else if (iph->protocol == IPPROTO_ICMP) { struct icmphdr *ih = (struct icmphdr *)(iph + 1); - - ih->checksum = 0; - ih->checksum = csum_unaligned(ih, len, 0); + csum_icmp4(ih, ih + 1, len - sizeof(*ih)); } if (tap_send(c, buf, len + sizeof(*iph) + sizeof(*eh), 1) < 0) -- 2.37.3
On Mon, 17 Oct 2022 19:57:55 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Although tap_ip_send() is currently the only place calculating ICMPv4 checksums, create a helper function for symmetry with ICMPv6. For future flexibility it allows the ICMPv6 header and payload to be in separate buffers. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 15 +++++++++++++++ checksum.h | 2 ++ tap.c | 4 +--- 3 files changed, 18 insertions(+), 3 deletions(-) diff --git a/checksum.c b/checksum.c index 0e207c8..c8b6b42 100644 --- a/checksum.c +++ b/checksum.c @@ -52,6 +52,7 @@ #include <stddef.h> #include <stdint.h> +#include <linux/icmp.h> #include <linux/icmpv6.h> /** @@ -107,6 +108,20 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) return (uint16_t)~csum_fold(sum_16b(buf, len) + init); } +/** + * csum_icmp4() - Calculate checksum for an ICMPv4 packet"Calculate and set"? By the way, there's no such thing as ICMPv4 -- it's ICMP.+ * @icmp4hr: ICMPv4 header, initialized apart from checksum...-ised, if you respin. For consistency, I would call this 'ih'.+ * @payload: ICMPv4 packet payload + * @len: Length of @payload (not including ICMPv4 header) + */ +void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len)I guess csum_icmp() is preferable. Indeed, for TCP and UDP 'tcp4' and 'udp4' make sense because those are the same protocols over IPv4 and IPv6.+{ + /* Partial checksum for ICMPv4 header alone */ + uint32_t hrsum = sum_16b(icmp4hr, sizeof(*icmp4hr));A white line would be nice here. I would call this psum (same as in csum_icmp6()) or hdrsum, 'hr' isn't really used for "header" elsewhere.+ icmp4hr->checksum = 0; + icmp4hr->checksum = csum_unaligned(payload, len, hrsum); +} + /** * csum_icmp6() - Calculate checksum for an ICMPv6 packet * @icmp6hr: ICMPv6 header, initialized apart from checksum diff --git a/checksum.h b/checksum.h index 2c72200..ff95cf9 100644 --- a/checksum.h +++ b/checksum.h @@ -6,11 +6,13 @@ #ifndef CHECKSUM_H #define CHECKSUM_H +struct icmphdr; struct icmp6hdr; uint32_t sum_16b(const void *buf, size_t len); uint16_t csum_fold(uint32_t sum); uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init); +void csum_icmp4(struct icmphdr *ih, const void *payload, size_t len); void csum_icmp6(struct icmp6hdr *ih, const struct in6_addr *saddr, const struct in6_addr *daddr, diff --git a/tap.c b/tap.c index aafc92b..f082901 100644 --- a/tap.c +++ b/tap.c @@ -148,9 +148,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, uh->check = 0; } else if (iph->protocol == IPPROTO_ICMP) { struct icmphdr *ih = (struct icmphdr *)(iph + 1); - - ih->checksum = 0; - ih->checksum = csum_unaligned(ih, len, 0); + csum_icmp4(ih, ih + 1, len - sizeof(*ih)); } if (tap_send(c, buf, len + sizeof(*iph) + sizeof(*eh), 1) < 0)-- Stefano
On Tue, Oct 18, 2022 at 05:01:51AM +0200, Stefano Brivio wrote:On Mon, 17 Oct 2022 19:57:55 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Done.Although tap_ip_send() is currently the only place calculating ICMPv4 checksums, create a helper function for symmetry with ICMPv6. For future flexibility it allows the ICMPv6 header and payload to be in separate buffers. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 15 +++++++++++++++ checksum.h | 2 ++ tap.c | 4 +--- 3 files changed, 18 insertions(+), 3 deletions(-) diff --git a/checksum.c b/checksum.c index 0e207c8..c8b6b42 100644 --- a/checksum.c +++ b/checksum.c @@ -52,6 +52,7 @@ #include <stddef.h> #include <stdint.h> +#include <linux/icmp.h> #include <linux/icmpv6.h> /** @@ -107,6 +108,20 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) return (uint16_t)~csum_fold(sum_16b(buf, len) + init); } +/** + * csum_icmp4() - Calculate checksum for an ICMPv4 packet"Calculate and set"?By the way, there's no such thing as ICMPv4 -- it's ICMP.Technically, yes, but I kind of wanted to make it clear at a glance that these are IPv4 specific functions. I'd also like to avoid the implication that v4 is the "normal" sort. I've changed from "ICMPv4" to "ICMP" in the comments, but I've left the '4's in the various namesSee above.+ * @icmp4hr: ICMPv4 header, initialized apart from checksum...-ised, if you respin. For consistency, I would call this 'ih'.+ * @payload: ICMPv4 packet payload + * @len: Length of @payload (not including ICMPv4 header) + */ +void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len)I guess csum_icmp() is preferable. Indeed, for TCP and UDP 'tcp4' and 'udp4' make sense because those are the same protocols over IPv4 and IPv6.Done.+{ + /* Partial checksum for ICMPv4 header alone */ + uint32_t hrsum = sum_16b(icmp4hr, sizeof(*icmp4hr));A white line would be nice here.I would call this psum (same as in csum_icmp6())Changed to 'psum'.or hdrsum, 'hr' isn't really used for "header" elsewhere.Well.. except as a suffix, 'ihr' etc.-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson+ icmp4hr->checksum = 0; + icmp4hr->checksum = csum_unaligned(payload, len, hrsum); +} + /** * csum_icmp6() - Calculate checksum for an ICMPv6 packet * @icmp6hr: ICMPv6 header, initialized apart from checksum diff --git a/checksum.h b/checksum.h index 2c72200..ff95cf9 100644 --- a/checksum.h +++ b/checksum.h @@ -6,11 +6,13 @@ #ifndef CHECKSUM_H #define CHECKSUM_H +struct icmphdr; struct icmp6hdr; uint32_t sum_16b(const void *buf, size_t len); uint16_t csum_fold(uint32_t sum); uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init); +void csum_icmp4(struct icmphdr *ih, const void *payload, size_t len); void csum_icmp6(struct icmp6hdr *ih, const struct in6_addr *saddr, const struct in6_addr *daddr, diff --git a/tap.c b/tap.c index aafc92b..f082901 100644 --- a/tap.c +++ b/tap.c @@ -148,9 +148,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, uh->check = 0; } else if (iph->protocol == IPPROTO_ICMP) { struct icmphdr *ih = (struct icmphdr *)(iph + 1); - - ih->checksum = 0; - ih->checksum = csum_unaligned(ih, len, 0); + csum_icmp4(ih, ih + 1, len - sizeof(*ih)); } if (tap_send(c, buf, len + sizeof(*iph) + sizeof(*eh), 1) < 0)
On Tue, 18 Oct 2022 23:06:11 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:On Tue, Oct 18, 2022 at 05:01:51AM +0200, Stefano Brivio wrote:Ah, yes, sure, makes sense, as long as we don't refer to "ICMPv4" in the comments I'm fine with it. :) -- StefanoOn Mon, 17 Oct 2022 19:57:55 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Done.Although tap_ip_send() is currently the only place calculating ICMPv4 checksums, create a helper function for symmetry with ICMPv6. For future flexibility it allows the ICMPv6 header and payload to be in separate buffers. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 15 +++++++++++++++ checksum.h | 2 ++ tap.c | 4 +--- 3 files changed, 18 insertions(+), 3 deletions(-) diff --git a/checksum.c b/checksum.c index 0e207c8..c8b6b42 100644 --- a/checksum.c +++ b/checksum.c @@ -52,6 +52,7 @@ #include <stddef.h> #include <stdint.h> +#include <linux/icmp.h> #include <linux/icmpv6.h> /** @@ -107,6 +108,20 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) return (uint16_t)~csum_fold(sum_16b(buf, len) + init); } +/** + * csum_icmp4() - Calculate checksum for an ICMPv4 packet"Calculate and set"?By the way, there's no such thing as ICMPv4 -- it's ICMP.Technically, yes, but I kind of wanted to make it clear at a glance that these are IPv4 specific functions. I'd also like to avoid the implication that v4 is the "normal" sort. I've changed from "ICMPv4" to "ICMP" in the comments, but I've left the '4's in the various names
Add a helper for calculating UDP checksums when used over IPv6 For future flexibility, the new helper takes parameters for the fields in the IPv6 pseudo-header, so an IPv6 header or pseudo-header doesn't need to be explicitly constructed. It also allows the UDP header and payload to be in separate buffers, although we don't use this yet. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 23 +++++++++++++++++++++++ checksum.h | 5 +++++ tap.c | 5 ++--- 3 files changed, 30 insertions(+), 3 deletions(-) diff --git a/checksum.c b/checksum.c index c8b6b42..0849fb1 100644 --- a/checksum.c +++ b/checksum.c @@ -52,6 +52,7 @@ #include <stddef.h> #include <stdint.h> +#include <linux/udp.h> #include <linux/icmp.h> #include <linux/icmpv6.h> @@ -122,6 +123,28 @@ void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len) icmp4hr->checksum = csum_unaligned(payload, len, hrsum); } +/** + * csum_udp6() - Calculate checksum for a UDP over IPv6 packet + * @udp6hr: UDP header, initialized apart from checksum + * @payload: UDP packet payload + * @len: Length of @payload (not including UDP header) + */ +void csum_udp6(struct udphdr *udp6hr, + const struct in6_addr *saddr, + const struct in6_addr *daddr, + const void *payload, size_t len) +{ + /* Partial checksum for the pseudo-IPv6 header */ + uint32_t psum = sum_16b(saddr, sizeof(*saddr)) + + sum_16b(daddr, sizeof(*daddr)) + + htons(len + sizeof(*udp6hr)) + htons(IPPROTO_UDP); + + udp6hr->check = 0; + /* Add in partial checksum for the UDP header alone */ + psum += sum_16b(udp6hr, sizeof(*udp6hr)); + udp6hr->check = csum_unaligned(payload, len, psum); +} + /** * csum_icmp6() - Calculate checksum for an ICMPv6 packet * @icmp6hr: ICMPv6 header, initialized apart from checksum diff --git a/checksum.h b/checksum.h index ff95cf9..1b9f48e 100644 --- a/checksum.h +++ b/checksum.h @@ -6,6 +6,7 @@ #ifndef CHECKSUM_H #define CHECKSUM_H +struct udphdr; struct icmphdr; struct icmp6hdr; @@ -13,6 +14,10 @@ uint32_t sum_16b(const void *buf, size_t len); uint16_t csum_fold(uint32_t sum); uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init); void csum_icmp4(struct icmphdr *ih, const void *payload, size_t len); +void csum_udp6(struct udphdr *udp6hr, + const struct in6_addr *saddr, + const struct in6_addr *daddr, + const void *payload, size_t len); void csum_icmp6(struct icmp6hdr *ih, const struct in6_addr *saddr, const struct in6_addr *daddr, diff --git a/tap.c b/tap.c index f082901..9c197cb 100644 --- a/tap.c +++ b/tap.c @@ -183,9 +183,8 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, } else if (proto == IPPROTO_UDP) { struct udphdr *uh = (struct udphdr *)(ip6h + 1); - uh->check = 0; - uh->check = csum_unaligned(ip6h, len + sizeof(*ip6h), - 0); + csum_udp6(uh, &ip6h->saddr, &ip6h->daddr, + uh + 1, len - sizeof(*uh)); } else if (proto == IPPROTO_ICMPV6) { struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1); -- 2.37.3
On Mon, 17 Oct 2022 19:57:56 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Add a helper for calculating UDP checksums when used over IPv6 For future flexibility, the new helper takes parameters for the fields in the IPv6 pseudo-header, so an IPv6 header or pseudo-header doesn't need to be explicitly constructed. It also allows the UDP header and payload to be in separate buffers, although we don't use this yet. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 23 +++++++++++++++++++++++ checksum.h | 5 +++++ tap.c | 5 ++--- 3 files changed, 30 insertions(+), 3 deletions(-) diff --git a/checksum.c b/checksum.c index c8b6b42..0849fb1 100644 --- a/checksum.c +++ b/checksum.c @@ -52,6 +52,7 @@ #include <stddef.h> #include <stdint.h> +#include <linux/udp.h> #include <linux/icmp.h> #include <linux/icmpv6.h> @@ -122,6 +123,28 @@ void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len) icmp4hr->checksum = csum_unaligned(payload, len, hrsum); } +/** + * csum_udp6() - Calculate checksum for a UDP over IPv6 packetCalculate and set.+ * @udp6hr: UDP header, initialized apart from checksum-ised.+ * @payload: UDP packet payload + * @len: Length of @payload (not including UDP header) + */ +void csum_udp6(struct udphdr *udp6hr, + const struct in6_addr *saddr, + const struct in6_addr *daddr,You could use some horizontal space.+ const void *payload, size_t len) +{ + /* Partial checksum for the pseudo-IPv6 header */ + uint32_t psum = sum_16b(saddr, sizeof(*saddr)) + + sum_16b(daddr, sizeof(*daddr)) + + htons(len + sizeof(*udp6hr)) + htons(IPPROTO_UDP);Alignment: uint32_t psum = sum_16b(saddr, sizeof(*saddr)) + sum_16b(daddr, sizeof(*daddr)) + htons(len + sizeof(*udp6hr)) + htons(IPPROTO_UDP);+ udp6hr->check = 0; + /* Add in partial checksum for the UDP header alone */ + psum += sum_16b(udp6hr, sizeof(*udp6hr)); + udp6hr->check = csum_unaligned(payload, len, psum); +} + /** * csum_icmp6() - Calculate checksum for an ICMPv6 packet * @icmp6hr: ICMPv6 header, initialized apart from checksum diff --git a/checksum.h b/checksum.h index ff95cf9..1b9f48e 100644 --- a/checksum.h +++ b/checksum.h @@ -6,6 +6,7 @@ #ifndef CHECKSUM_H #define CHECKSUM_H +struct udphdr; struct icmphdr; struct icmp6hdr; @@ -13,6 +14,10 @@ uint32_t sum_16b(const void *buf, size_t len); uint16_t csum_fold(uint32_t sum); uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init); void csum_icmp4(struct icmphdr *ih, const void *payload, size_t len); +void csum_udp6(struct udphdr *udp6hr, + const struct in6_addr *saddr, + const struct in6_addr *daddr, + const void *payload, size_t len);Use some horizontal space.void csum_icmp6(struct icmp6hdr *ih, const struct in6_addr *saddr, const struct in6_addr *daddr, diff --git a/tap.c b/tap.c index f082901..9c197cb 100644 --- a/tap.c +++ b/tap.c @@ -183,9 +183,8 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, } else if (proto == IPPROTO_UDP) { struct udphdr *uh = (struct udphdr *)(ip6h + 1); - uh->check = 0; - uh->check = csum_unaligned(ip6h, len + sizeof(*ip6h), - 0); + csum_udp6(uh, &ip6h->saddr, &ip6h->daddr, + uh + 1, len - sizeof(*uh)); } else if (proto == IPPROTO_ICMPV6) { struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1);-- Stefano
On Tue, Oct 18, 2022 at 05:02:31AM +0200, Stefano Brivio wrote:On Mon, 17 Oct 2022 19:57:56 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Done.Add a helper for calculating UDP checksums when used over IPv6 For future flexibility, the new helper takes parameters for the fields in the IPv6 pseudo-header, so an IPv6 header or pseudo-header doesn't need to be explicitly constructed. It also allows the UDP header and payload to be in separate buffers, although we don't use this yet. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 23 +++++++++++++++++++++++ checksum.h | 5 +++++ tap.c | 5 ++--- 3 files changed, 30 insertions(+), 3 deletions(-) diff --git a/checksum.c b/checksum.c index c8b6b42..0849fb1 100644 --- a/checksum.c +++ b/checksum.c @@ -52,6 +52,7 @@ #include <stddef.h> #include <stdint.h> +#include <linux/udp.h> #include <linux/icmp.h> #include <linux/icmpv6.h> @@ -122,6 +123,28 @@ void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len) icmp4hr->checksum = csum_unaligned(payload, len, hrsum); } +/** + * csum_udp6() - Calculate checksum for a UDP over IPv6 packetCalculate and set.Done.+ * @udp6hr: UDP header, initialized apart from checksum-ised.Done.+ * @payload: UDP packet payload + * @len: Length of @payload (not including UDP header) + */ +void csum_udp6(struct udphdr *udp6hr, + const struct in6_addr *saddr, + const struct in6_addr *daddr,You could use some horizontal space.Done.+ const void *payload, size_t len) +{ + /* Partial checksum for the pseudo-IPv6 header */ + uint32_t psum = sum_16b(saddr, sizeof(*saddr)) + + sum_16b(daddr, sizeof(*daddr)) + + htons(len + sizeof(*udp6hr)) + htons(IPPROTO_UDP);Alignment: uint32_t psum = sum_16b(saddr, sizeof(*saddr)) + sum_16b(daddr, sizeof(*daddr)) + htons(len + sizeof(*udp6hr)) + htons(IPPROTO_UDP);Done.+ udp6hr->check = 0; + /* Add in partial checksum for the UDP header alone */ + psum += sum_16b(udp6hr, sizeof(*udp6hr)); + udp6hr->check = csum_unaligned(payload, len, psum); +} + /** * csum_icmp6() - Calculate checksum for an ICMPv6 packet * @icmp6hr: ICMPv6 header, initialized apart from checksum diff --git a/checksum.h b/checksum.h index ff95cf9..1b9f48e 100644 --- a/checksum.h +++ b/checksum.h @@ -6,6 +6,7 @@ #ifndef CHECKSUM_H #define CHECKSUM_H +struct udphdr; struct icmphdr; struct icmp6hdr; @@ -13,6 +14,10 @@ uint32_t sum_16b(const void *buf, size_t len); uint16_t csum_fold(uint32_t sum); uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init); void csum_icmp4(struct icmphdr *ih, const void *payload, size_t len); +void csum_udp6(struct udphdr *udp6hr, + const struct in6_addr *saddr, + const struct in6_addr *daddr, + const void *payload, size_t len);Use some horizontal space.-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibsonvoid csum_icmp6(struct icmp6hdr *ih, const struct in6_addr *saddr, const struct in6_addr *daddr, diff --git a/tap.c b/tap.c index f082901..9c197cb 100644 --- a/tap.c +++ b/tap.c @@ -183,9 +183,8 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, } else if (proto == IPPROTO_UDP) { struct udphdr *uh = (struct udphdr *)(ip6h + 1); - uh->check = 0; - uh->check = csum_unaligned(ip6h, len + sizeof(*ip6h), - 0); + csum_udp6(uh, &ip6h->saddr, &ip6h->daddr, + uh + 1, len - sizeof(*uh)); } else if (proto == IPPROTO_ICMPV6) { struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1);
At least two places in passt fill in UDP over IPv4 checksums, although since UDP checksums are optional with IPv4 that just amounts to storing a 0 (in tap_ip_send()) or leaving a 0 from an earlier initialization (in dhcp()). For consistency, add a helper for this "calculation". Just for the heck of it, add the option (compile time disabled for now) to calculate real UDP checksums. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 33 +++++++++++++++++++++++++++++++++ checksum.h | 3 +++ dhcp.c | 2 +- tap.c | 2 +- 4 files changed, 38 insertions(+), 2 deletions(-) diff --git a/checksum.c b/checksum.c index 0849fb1..72f1cfb 100644 --- a/checksum.c +++ b/checksum.c @@ -56,6 +56,11 @@ #include <linux/icmp.h> #include <linux/icmpv6.h> +/* Checksums are optional for UDP over IPv4, so we usually just set + * them to 0. Change this 1 to calculate real UDP over IPv4 checksums + */ +#define UDP4_REAL_CHECKSUMS 0 + /** * sum_16b() - Calculate sum of 16-bit words * @buf: Input buffer @@ -109,6 +114,34 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) return (uint16_t)~csum_fold(sum_16b(buf, len) + init); } +/** + * csum_udp4() - Calculate checksum for a UDP over IPv4 packet + * @udp4hr: UDP header, initialized apart from checksum + * @saddr: IPv4 source address + * @daddr: IPv4 destination address + * @payload: ICMPv4 packet payload + * @len: Length of @payload (not including UDP) + */ +void csum_udp4(struct udphdr *udp4hr, + in_addr_t saddr, in_addr_t daddr, + const void *payload, size_t len) +{ + /* UDP checksums are optional, so don't bother */ + udp4hr->check = 0; + + if (UDP4_REAL_CHECKSUMS) { + /* UNTESTED: if we did want real UDPv4 checksums, this + * is roughly what we'd need */ + uint32_t psum = csum_fold(htonl(saddr)) + + csum_fold(htonl(daddr)) + + htons(len + sizeof(*udp4hr)) + + htons(IPPROTO_UDP); + /* Add in partial checksum for the UDP header alone */ + psum += sum_16b(udp4hr, sizeof(*udp4hr)); + udp4hr->check = csum_unaligned(payload, len, psum); + } +} + /** * csum_icmp4() - Calculate checksum for an ICMPv4 packet * @icmp4hr: ICMPv4 header, initialized apart from checksum diff --git a/checksum.h b/checksum.h index 1b9f48e..a9502b9 100644 --- a/checksum.h +++ b/checksum.h @@ -13,6 +13,9 @@ struct icmp6hdr; uint32_t sum_16b(const void *buf, size_t len); uint16_t csum_fold(uint32_t sum); uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init); +void csum_udp4(struct udphdr *udp4hr, + in_addr_t saddr, in_addr_t daddr, + const void *payload, size_t len); void csum_icmp4(struct icmphdr *ih, const void *payload, size_t len); void csum_udp6(struct udphdr *udp6hr, const struct in6_addr *saddr, diff --git a/dhcp.c b/dhcp.c index 7f0cc0b..8dcf645 100644 --- a/dhcp.c +++ b/dhcp.c @@ -364,9 +364,9 @@ int dhcp(const struct ctx *c, const struct pool *p) opt_set_dns_search(c, sizeof(m->o)); uh->len = htons(len = offsetof(struct msg, o) + fill(m) + sizeof(*uh)); - uh->check = 0; uh->source = htons(67); uh->dest = htons(68); + csum_udp4(uh, c->ip4.gw, c->ip4.addr, uh + 1, len - sizeof(*uh)); iph->tot_len = htons(len += sizeof(*iph)); iph->daddr = c->ip4.addr; diff --git a/tap.c b/tap.c index 9c197cb..58fc1de 100644 --- a/tap.c +++ b/tap.c @@ -145,7 +145,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, } else if (iph->protocol == IPPROTO_UDP) { struct udphdr *uh = (struct udphdr *)(iph + 1); - uh->check = 0; + csum_udp4(uh, iph->saddr, iph->daddr, uh + 1, len - sizeof(*uh)); } else if (iph->protocol == IPPROTO_ICMP) { struct icmphdr *ih = (struct icmphdr *)(iph + 1); csum_icmp4(ih, ih + 1, len - sizeof(*ih)); -- 2.37.3
On Mon, 17 Oct 2022 19:57:57 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:At least two places in passt fill in UDP over IPv4 checksums, although since UDP checksums are optional with IPv4 that just amounts to storing a 0 (in tap_ip_send()) or leaving a 0 from an earlier initialization (in dhcp()). For consistency, add a helper for this "calculation". Just for the heck of it, add the option (compile time disabled for now) to calculate real UDP checksums. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 33 +++++++++++++++++++++++++++++++++ checksum.h | 3 +++ dhcp.c | 2 +- tap.c | 2 +- 4 files changed, 38 insertions(+), 2 deletions(-) diff --git a/checksum.c b/checksum.c index 0849fb1..72f1cfb 100644 --- a/checksum.c +++ b/checksum.c @@ -56,6 +56,11 @@ #include <linux/icmp.h> #include <linux/icmpv6.h> +/* Checksums are optional for UDP over IPv4, so we usually just set + * them to 0. Change this 1 to calculate real UDP over IPv4 checksumsto 1+ */ +#define UDP4_REAL_CHECKSUMS 0 + /** * sum_16b() - Calculate sum of 16-bit words * @buf: Input buffer @@ -109,6 +114,34 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) return (uint16_t)~csum_fold(sum_16b(buf, len) + init); } +/** + * csum_udp4() - Calculate checksum for a UDP over IPv4 packetand set+ * @udp4hr: UDP header, initialized apart from checksum + * @saddr: IPv4 source address + * @daddr: IPv4 destination address + * @payload: ICMPv4 packet payload + * @len: Length of @payload (not including UDP) + */ +void csum_udp4(struct udphdr *udp4hr, + in_addr_t saddr, in_addr_t daddr, + const void *payload, size_t len) +{ + /* UDP checksums are optional, so don't bother */ + udp4hr->check = 0; + + if (UDP4_REAL_CHECKSUMS) { + /* UNTESTED: if we did want real UDPv4 checksums, this + * is roughly what we'd need */ + uint32_t psum = csum_fold(htonl(saddr)) + + csum_fold(htonl(daddr)) + + htons(len + sizeof(*udp4hr)) + + htons(IPPROTO_UDP); + /* Add in partial checksum for the UDP header alone */ + psum += sum_16b(udp4hr, sizeof(*udp4hr)); + udp4hr->check = csum_unaligned(payload, len, psum); + } +} + /** * csum_icmp4() - Calculate checksum for an ICMPv4 packet * @icmp4hr: ICMPv4 header, initialized apart from checksum diff --git a/checksum.h b/checksum.h index 1b9f48e..a9502b9 100644 --- a/checksum.h +++ b/checksum.h @@ -13,6 +13,9 @@ struct icmp6hdr; uint32_t sum_16b(const void *buf, size_t len); uint16_t csum_fold(uint32_t sum); uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init); +void csum_udp4(struct udphdr *udp4hr, + in_addr_t saddr, in_addr_t daddr, + const void *payload, size_t len);Horizontal space. -- Stefano
On Tue, Oct 18, 2022 at 05:03:09AM +0200, Stefano Brivio wrote: 11;rgb:ffff/ffff/ffff> On Mon, 17 Oct 2022 19:57:57 +1100David Gibson <david(a)gibson.dropbear.id.au> wrote:Done.At least two places in passt fill in UDP over IPv4 checksums, although since UDP checksums are optional with IPv4 that just amounts to storing a 0 (in tap_ip_send()) or leaving a 0 from an earlier initialization (in dhcp()). For consistency, add a helper for this "calculation". Just for the heck of it, add the option (compile time disabled for now) to calculate real UDP checksums. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 33 +++++++++++++++++++++++++++++++++ checksum.h | 3 +++ dhcp.c | 2 +- tap.c | 2 +- 4 files changed, 38 insertions(+), 2 deletions(-) diff --git a/checksum.c b/checksum.c index 0849fb1..72f1cfb 100644 --- a/checksum.c +++ b/checksum.c @@ -56,6 +56,11 @@ #include <linux/icmp.h> #include <linux/icmpv6.h> +/* Checksums are optional for UDP over IPv4, so we usually just set + * them to 0. Change this 1 to calculate real UDP over IPv4 checksumsto 1Done.+ */ +#define UDP4_REAL_CHECKSUMS 0 + /** * sum_16b() - Calculate sum of 16-bit words * @buf: Input buffer @@ -109,6 +114,34 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) return (uint16_t)~csum_fold(sum_16b(buf, len) + init); } +/** + * csum_udp4() - Calculate checksum for a UDP over IPv4 packetand setDone. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson+ * @udp4hr: UDP header, initialized apart from checksum + * @saddr: IPv4 source address + * @daddr: IPv4 destination address + * @payload: ICMPv4 packet payload + * @len: Length of @payload (not including UDP) + */ +void csum_udp4(struct udphdr *udp4hr, + in_addr_t saddr, in_addr_t daddr, + const void *payload, size_t len) +{ + /* UDP checksums are optional, so don't bother */ + udp4hr->check = 0; + + if (UDP4_REAL_CHECKSUMS) { + /* UNTESTED: if we did want real UDPv4 checksums, this + * is roughly what we'd need */ + uint32_t psum = csum_fold(htonl(saddr)) + + csum_fold(htonl(daddr)) + + htons(len + sizeof(*udp4hr)) + + htons(IPPROTO_UDP); + /* Add in partial checksum for the UDP header alone */ + psum += sum_16b(udp4hr, sizeof(*udp4hr)); + udp4hr->check = csum_unaligned(payload, len, psum); + } +} + /** * csum_icmp4() - Calculate checksum for an ICMPv4 packet * @icmp4hr: ICMPv4 header, initialized apart from checksum diff --git a/checksum.h b/checksum.h index 1b9f48e..a9502b9 100644 --- a/checksum.h +++ b/checksum.h @@ -13,6 +13,9 @@ struct icmp6hdr; uint32_t sum_16b(const void *buf, size_t len); uint16_t csum_fold(uint32_t sum); uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init); +void csum_udp4(struct udphdr *udp4hr, + in_addr_t saddr, in_addr_t daddr, + const void *payload, size_t len);Horizontal space.
We calculate IPv4 header checksums in at least two places, in dhcp() and in tap_ip_send. Add a helper to handle this calculation in both places. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 6 ++++++ checksum.h | 1 + dhcp.c | 3 +-- tap.c | 3 +-- 4 files changed, 9 insertions(+), 4 deletions(-) diff --git a/checksum.c b/checksum.c index 72f1cfb..f25a96a 100644 --- a/checksum.c +++ b/checksum.c @@ -114,6 +114,12 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) return (uint16_t)~csum_fold(sum_16b(buf, len) + init); } +void csum_ip4_header(struct iphdr *ip4hr) +{ + ip4hr->check = 0; + ip4hr->check = csum_unaligned(ip4hr, (size_t)ip4hr->ihl * 4, 0); +} + /** * csum_udp4() - Calculate checksum for a UDP over IPv4 packet * @udp4hr: UDP header, initialized apart from checksum diff --git a/checksum.h b/checksum.h index a9502b9..bdb2ed2 100644 --- a/checksum.h +++ b/checksum.h @@ -13,6 +13,7 @@ struct icmp6hdr; uint32_t sum_16b(const void *buf, size_t len); uint16_t csum_fold(uint32_t sum); uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init); +void csum_ip4_header(struct iphdr *ip4hr); void csum_udp4(struct udphdr *udp4hr, in_addr_t saddr, in_addr_t daddr, const void *payload, size_t len); diff --git a/dhcp.c b/dhcp.c index 8dcf645..875e18b 100644 --- a/dhcp.c +++ b/dhcp.c @@ -371,8 +371,7 @@ int dhcp(const struct ctx *c, const struct pool *p) iph->tot_len = htons(len += sizeof(*iph)); iph->daddr = c->ip4.addr; iph->saddr = c->ip4.gw; - iph->check = 0; - iph->check = csum_unaligned(iph, (intptr_t)(iph->ihl * 4), 0); + csum_ip4_header(iph); len += sizeof(*eh); memcpy(eh->h_dest, eh->h_source, ETH_ALEN); diff --git a/tap.c b/tap.c index 58fc1de..de02c56 100644 --- a/tap.c +++ b/tap.c @@ -135,8 +135,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, iph->daddr = c->ip4.addr_seen; memcpy(&iph->saddr, &src->s6_addr[12], 4); - iph->check = 0; - iph->check = csum_unaligned(iph, (size_t)iph->ihl * 4, 0); + csum_ip4_header(iph); memcpy(data, in, len); -- 2.37.3
On Mon, 17 Oct 2022 19:57:58 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:We calculate IPv4 header checksums in at least two places, in dhcp() and in tap_ip_send. Add a helper to handle this calculation in both places. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 6 ++++++ checksum.h | 1 + dhcp.c | 3 +-- tap.c | 3 +-- 4 files changed, 9 insertions(+), 4 deletions(-) diff --git a/checksum.c b/checksum.c index 72f1cfb..f25a96a 100644 --- a/checksum.c +++ b/checksum.c @@ -114,6 +114,12 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) return (uint16_t)~csum_fold(sum_16b(buf, len) + init); }/** * csum_ip4_header() - Calculate and set IPv4 header checksum * @iph: IPv4 header */ ...I just tried to run Doxygen, I think it would be nice to have eventually (especially for DOT call graphs), things don't look too bad.+void csum_ip4_header(struct iphdr *ip4hr) +{ + ip4hr->check = 0; + ip4hr->check = csum_unaligned(ip4hr, (size_t)ip4hr->ihl * 4, 0);iph, for consistency. -- Stefano
On Tue, Oct 18, 2022 at 05:03:49AM +0200, Stefano Brivio wrote:On Mon, 17 Oct 2022 19:57:58 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:As noted before, I'd prefer to avoid the implication that IPv4 is normal and IPv6 is special. I have changed to just ip4h, though. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibsonWe calculate IPv4 header checksums in at least two places, in dhcp() and in tap_ip_send. Add a helper to handle this calculation in both places. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 6 ++++++ checksum.h | 1 + dhcp.c | 3 +-- tap.c | 3 +-- 4 files changed, 9 insertions(+), 4 deletions(-) diff --git a/checksum.c b/checksum.c index 72f1cfb..f25a96a 100644 --- a/checksum.c +++ b/checksum.c @@ -114,6 +114,12 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init) return (uint16_t)~csum_fold(sum_16b(buf, len) + init); }/** * csum_ip4_header() - Calculate and set IPv4 header checksum * @iph: IPv4 header */ ...I just tried to run Doxygen, I think it would be nice to have eventually (especially for DOT call graphs), things don't look too bad.+void csum_ip4_header(struct iphdr *ip4hr) +{ + ip4hr->check = 0; + ip4hr->check = csum_unaligned(ip4hr, (size_t)ip4hr->ihl * 4, 0);iph, for consistency.
tap_ip_send() doesn't take a destination address, because it's specifically for inbound packets, and the IP addresses of the guest/namespace are already known to us. Rather than open-coding this destination address logic, make helper functions for it which will enable some later cleanups. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- tap.c | 29 ++++++++++++++++++++++++----- tap.h | 3 +++ 2 files changed, 27 insertions(+), 5 deletions(-) diff --git a/tap.c b/tap.c index de02c56..41e8ff2 100644 --- a/tap.c +++ b/tap.c @@ -96,6 +96,28 @@ int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre) return write(c->fd_tap, (char *)data + (vnet_pre ? 4 : 0), len); } +/** + * tap_ip4_daddr() - Normal IPv4 destination address for inbound packets + * @c: Execution context + */ +in_addr_t tap_ip4_daddr(const struct ctx *c) +{ + return c->ip4.addr_seen; +} + +/** + * tap_ip6_daddr() - Normal IPv4 destination address for inbound packets + * @c: Execution context + * @src: Source address + */ +const struct in6_addr *tap_ip6_daddr(const struct ctx *c, + const struct in6_addr *src) +{ + if (IN6_IS_ADDR_LINKLOCAL(src)) + return &c->ip6.addr_ll_seen; + return &c->ip6.addr_seen; +} + /** * tap_ip_send() - Send IP packet, with L2 headers, calculating L3/L4 checksums * @c: Execution context @@ -132,7 +154,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, iph->frag_off = 0; iph->ttl = 255; iph->protocol = proto; - iph->daddr = c->ip4.addr_seen; + iph->daddr = tap_ip4_daddr(c); memcpy(&iph->saddr, &src->s6_addr[12], 4); csum_ip4_header(iph); @@ -163,10 +185,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, ip6h->priority = 0; ip6h->saddr = *src; - if (IN6_IS_ADDR_LINKLOCAL(src)) - ip6h->daddr = c->ip6.addr_ll_seen; - else - ip6h->daddr = c->ip6.addr_seen; + ip6h->daddr = *tap_ip6_daddr(c, src); memcpy(data, in, len); diff --git a/tap.h b/tap.h index df3aec0..a6764b4 100644 --- a/tap.h +++ b/tap.h @@ -6,6 +6,9 @@ #ifndef TAP_H #define TAP_H +in_addr_t tap_ip4_daddr(const struct ctx *c); +const struct in6_addr *tap_ip6_daddr(const struct ctx *c, + const struct in6_addr *src); void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, const char *in, size_t len, uint32_t flow); int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre); -- 2.37.3
On Mon, 17 Oct 2022 19:57:59 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:tap_ip_send() doesn't take a destination address, because it's specifically for inbound packets, and the IP addresses of the guest/namespace are already known to us. Rather than open-coding this destination address logic, make helper functions for it which will enable some later cleanups. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- tap.c | 29 ++++++++++++++++++++++++----- tap.h | 3 +++ 2 files changed, 27 insertions(+), 5 deletions(-) diff --git a/tap.c b/tap.c index de02c56..41e8ff2 100644 --- a/tap.c +++ b/tap.c @@ -96,6 +96,28 @@ int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre) return write(c->fd_tap, (char *)data + (vnet_pre ? 4 : 0), len); } +/** + * tap_ip4_daddr() - Normal IPv4 destination address for inbound packets + * @c: Execution contextGiven that the address is returned in network order, I think this would be relevant here: * Return: IPv4 address, network order+ */ +in_addr_t tap_ip4_daddr(const struct ctx *c) +{ + return c->ip4.addr_seen; +} + +/** + * tap_ip6_daddr() - Normal IPv4 destination address for inbound packets + * @c: Execution context + * @src: Source address* Return: pointer to IPv6 address+ */ +const struct in6_addr *tap_ip6_daddr(const struct ctx *c, + const struct in6_addr *src) +{ + if (IN6_IS_ADDR_LINKLOCAL(src)) + return &c->ip6.addr_ll_seen; + return &c->ip6.addr_seen; +} + /** * tap_ip_send() - Send IP packet, with L2 headers, calculating L3/L4 checksums * @c: Execution context @@ -132,7 +154,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, iph->frag_off = 0; iph->ttl = 255; iph->protocol = proto; - iph->daddr = c->ip4.addr_seen; + iph->daddr = tap_ip4_daddr(c); memcpy(&iph->saddr, &src->s6_addr[12], 4); csum_ip4_header(iph); @@ -163,10 +185,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, ip6h->priority = 0; ip6h->saddr = *src; - if (IN6_IS_ADDR_LINKLOCAL(src)) - ip6h->daddr = c->ip6.addr_ll_seen; - else - ip6h->daddr = c->ip6.addr_seen; + ip6h->daddr = *tap_ip6_daddr(c, src); memcpy(data, in, len); diff --git a/tap.h b/tap.h index df3aec0..a6764b4 100644 --- a/tap.h +++ b/tap.h @@ -6,6 +6,9 @@ #ifndef TAP_H #define TAP_H +in_addr_t tap_ip4_daddr(const struct ctx *c); +const struct in6_addr *tap_ip6_daddr(const struct ctx *c, + const struct in6_addr *src); void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, const char *in, size_t len, uint32_t flow); int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre);-- Stefano
On Tue, Oct 18, 2022 at 05:04:41AM +0200, Stefano Brivio wrote: 11;rgb:ffff/ffff/ffff> On Mon, 17 Oct 2022 19:57:59 +1100David Gibson <david(a)gibson.dropbear.id.au> wrote:Done.tap_ip_send() doesn't take a destination address, because it's specifically for inbound packets, and the IP addresses of the guest/namespace are already known to us. Rather than open-coding this destination address logic, make helper functions for it which will enable some later cleanups. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- tap.c | 29 ++++++++++++++++++++++++----- tap.h | 3 +++ 2 files changed, 27 insertions(+), 5 deletions(-) diff --git a/tap.c b/tap.c index de02c56..41e8ff2 100644 --- a/tap.c +++ b/tap.c @@ -96,6 +96,28 @@ int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre) return write(c->fd_tap, (char *)data + (vnet_pre ? 4 : 0), len); } +/** + * tap_ip4_daddr() - Normal IPv4 destination address for inbound packets + * @c: Execution contextGiven that the address is returned in network order, I think this would be relevant here: * Return: IPv4 address, network orderDone.+ */ +in_addr_t tap_ip4_daddr(const struct ctx *c) +{ + return c->ip4.addr_seen; +} + +/** + * tap_ip6_daddr() - Normal IPv4 destination address for inbound packets + * @c: Execution context + * @src: Source address* Return: pointer to IPv6 address-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson+ */ +const struct in6_addr *tap_ip6_daddr(const struct ctx *c, + const struct in6_addr *src) +{ + if (IN6_IS_ADDR_LINKLOCAL(src)) + return &c->ip6.addr_ll_seen; + return &c->ip6.addr_seen; +} + /** * tap_ip_send() - Send IP packet, with L2 headers, calculating L3/L4 checksums * @c: Execution context @@ -132,7 +154,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, iph->frag_off = 0; iph->ttl = 255; iph->protocol = proto; - iph->daddr = c->ip4.addr_seen; + iph->daddr = tap_ip4_daddr(c); memcpy(&iph->saddr, &src->s6_addr[12], 4); csum_ip4_header(iph); @@ -163,10 +185,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, ip6h->priority = 0; ip6h->saddr = *src; - if (IN6_IS_ADDR_LINKLOCAL(src)) - ip6h->daddr = c->ip6.addr_ll_seen; - else - ip6h->daddr = c->ip6.addr_seen; + ip6h->daddr = *tap_ip6_daddr(c, src); memcpy(data, in, len); diff --git a/tap.h b/tap.h index df3aec0..a6764b4 100644 --- a/tap.h +++ b/tap.h @@ -6,6 +6,9 @@ #ifndef TAP_H #define TAP_H +in_addr_t tap_ip4_daddr(const struct ctx *c); +const struct in6_addr *tap_ip6_daddr(const struct ctx *c, + const struct in6_addr *src); void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, const char *in, size_t len, uint32_t flow); int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre);
tap_ip_send() is never used for TCP packets, we're unlikely to use it for that in future, and the handling of TCP packets makes other cleanups unnecessarily awkward. Remove it. This is the only user of csum_tcp4(), so we can remove that as well. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- checksum.c | 34 ---------------------------------- checksum.h | 1 - tap.c | 11 ++--------- 3 files changed, 2 insertions(+), 44 deletions(-) diff --git a/checksum.c b/checksum.c index f25a96a..887cfe3 100644 --- a/checksum.c +++ b/checksum.c @@ -209,40 +209,6 @@ void csum_icmp6(struct icmp6hdr *icmp6hr, icmp6hr->icmp6_cksum = csum_unaligned(payload, len, psum); } -/** - * csum_tcp4() - Calculate TCP checksum for IPv4 and set in place - * @iph: Packet buffer, IP header - */ -void csum_tcp4(struct iphdr *iph) -{ - uint16_t tlen = ntohs(iph->tot_len) - iph->ihl * 4, *p; - struct tcphdr *th; - uint32_t sum = 0; - - th = (struct tcphdr *)((char *)iph + (intptr_t)(iph->ihl * 4)); - p = (uint16_t *)th; - - sum += (iph->saddr >> 16) & 0xffff; - sum += iph->saddr & 0xffff; - sum += (iph->daddr >> 16) & 0xffff; - sum += iph->daddr & 0xffff; - - sum += htons(IPPROTO_TCP); - sum += htons(tlen); - - th->check = 0; - while (tlen > 1) { - sum += *p++; - tlen -= 2; - } - - if (tlen > 0) { - sum += *p & htons(0xff00); - } - - th->check = (uint16_t)~csum_fold(sum); -} - #ifdef __AVX2__ #include <immintrin.h> diff --git a/checksum.h b/checksum.h index bdb2ed2..eb3640e 100644 --- a/checksum.h +++ b/checksum.h @@ -27,7 +27,6 @@ void csum_icmp6(struct icmp6hdr *ih, const struct in6_addr *daddr, const void *payload, size_t len); -void csum_tcp4(struct iphdr *iph); uint16_t csum(const void *buf, size_t len, uint32_t init); #endif /* CHECKSUM_H */ diff --git a/tap.c b/tap.c index 41e8ff2..3ad5d7c 100644 --- a/tap.c +++ b/tap.c @@ -161,9 +161,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, memcpy(data, in, len); - if (iph->protocol == IPPROTO_TCP) { - csum_tcp4(iph); - } else if (iph->protocol == IPPROTO_UDP) { + if (iph->protocol == IPPROTO_UDP) { struct udphdr *uh = (struct udphdr *)(iph + 1); csum_udp4(uh, iph->saddr, iph->daddr, uh + 1, len - sizeof(*uh)); @@ -192,13 +190,8 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, ip6h->hop_limit = proto; ip6h->version = 0; ip6h->nexthdr = 0; - if (proto == IPPROTO_TCP) { - struct tcphdr *th = (struct tcphdr *)(ip6h + 1); - th->check = 0; - th->check = csum_unaligned(ip6h, len + sizeof(*ip6h), - 0); - } else if (proto == IPPROTO_UDP) { + if (proto == IPPROTO_UDP) { struct udphdr *uh = (struct udphdr *)(ip6h + 1); csum_udp6(uh, &ip6h->saddr, &ip6h->daddr, -- 2.37.3
Callers of tap_send() can optionally use a small optimization by adding extra space for the 4 byte length header used on the qemu socket interface. tap_ip_send() is currently the only user of this, but this is used only for "slow path" ICMP and DHCP packets, so there's not a lot of value to the optimization. Worse, having the two paths here complicates the interface and makes future cleanups difficult, so just remove it. I have some plans to bring back the optimization in a more general way in future, but for now it's just in the way. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- arp.c | 2 +- dhcp.c | 2 +- ndp.c | 2 +- tap.c | 29 +++++++++-------------------- tap.h | 2 +- 5 files changed, 13 insertions(+), 24 deletions(-) diff --git a/arp.c b/arp.c index 0ad97af..141d43f 100644 --- a/arp.c +++ b/arp.c @@ -81,7 +81,7 @@ int arp(const struct ctx *c, const struct pool *p) memcpy(eh->h_dest, eh->h_source, sizeof(eh->h_dest)); memcpy(eh->h_source, c->mac, sizeof(eh->h_source)); - if (tap_send(c, eh, len, 0) < 0) + if (tap_send(c, eh, len) < 0) perror("ARP: send"); return 1; diff --git a/dhcp.c b/dhcp.c index 875e18b..2b3af82 100644 --- a/dhcp.c +++ b/dhcp.c @@ -377,7 +377,7 @@ int dhcp(const struct ctx *c, const struct pool *p) memcpy(eh->h_dest, eh->h_source, ETH_ALEN); memcpy(eh->h_source, c->mac, ETH_ALEN); - if (tap_send(c, eh, len, 0) < 0) + if (tap_send(c, eh, len) < 0) perror("DHCP: send"); return 1; diff --git a/ndp.c b/ndp.c index 03f1d06..79be0cf 100644 --- a/ndp.c +++ b/ndp.c @@ -200,7 +200,7 @@ dns_done: memcpy(ehr->h_source, c->mac, ETH_ALEN); ehr->h_proto = htons(ETH_P_IPV6); - if (tap_send(c, ehr, len, 0) < 0) + if (tap_send(c, ehr, len) < 0) perror("NDP: send"); return 1; diff --git a/tap.c b/tap.c index 3ad5d7c..ae75fac 100644 --- a/tap.c +++ b/tap.c @@ -66,34 +66,24 @@ static PACKET_POOL_NOINIT(pool_tap6, TAP_MSGS, pkt_buf); * @c: Execution context * @data: Packet buffer * @len: Total L2 packet length - * @vnet_pre: Buffer has four-byte headroom * * Return: return code from send() or write() */ -int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre) +int tap_send(const struct ctx *c, const void *data, size_t len) { - if (vnet_pre) - pcap((char *)data + 4, len); - else - pcap(data, len); + pcap(data, len); if (c->mode == MODE_PASST) { int flags = MSG_NOSIGNAL | MSG_DONTWAIT; + uint32_t vnet_len = htonl(len); - if (vnet_pre) { - *((uint32_t *)data) = htonl(len); - len += 4; - } else { - uint32_t vnet_len = htonl(len); - - if (send(c->fd_tap, &vnet_len, 4, flags) < 0) - return -1; - } + if (send(c->fd_tap, &vnet_len, 4, flags) < 0) + return -1; return send(c->fd_tap, data, len, flags); } - return write(c->fd_tap, (char *)data + (vnet_pre ? 4 : 0), len); + return write(c->fd_tap, (char *)data, len); } /** @@ -131,10 +121,9 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, const char *in, size_t len, uint32_t flow) { char buf[USHRT_MAX]; - char *pkt = buf + 4; struct ethhdr *eh; - eh = (struct ethhdr *)pkt; + eh = (struct ethhdr *)buf; /* TODO: ARP table lookup */ memcpy(eh->h_dest, c->mac_guest, ETH_ALEN); @@ -170,7 +159,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, csum_icmp4(ih, ih + 1, len - sizeof(*ih)); } - if (tap_send(c, buf, len + sizeof(*iph) + sizeof(*eh), 1) < 0) + if (tap_send(c, buf, len + sizeof(*iph) + sizeof(*eh)) < 0) debug("tap: failed to send %lu bytes (IPv4)", len); } else { struct ipv6hdr *ip6h = (struct ipv6hdr *)(eh + 1); @@ -211,7 +200,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, ip6h->flow_lbl[2] = (flow >> 0) & 0xff; } - if (tap_send(c, buf, len + sizeof(*ip6h) + sizeof(*eh), 1) < 1) + if (tap_send(c, buf, len + sizeof(*ip6h) + sizeof(*eh)) < 1) debug("tap: failed to send %lu bytes (IPv6)", len); } } diff --git a/tap.h b/tap.h index a6764b4..a8da8bb 100644 --- a/tap.h +++ b/tap.h @@ -11,7 +11,7 @@ const struct in6_addr *tap_ip6_daddr(const struct ctx *c, const struct in6_addr *src); void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, const char *in, size_t len, uint32_t flow); -int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre); +int tap_send(const struct ctx *c, const void *data, size_t len); void tap_handler(struct ctx *c, int fd, uint32_t events, const struct timespec *now); void tap_sock_init(struct ctx *c); -- 2.37.3
On Mon, 17 Oct 2022 19:58:01 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Callers of tap_send() can optionally use a small optimization by adding extra space for the 4 byte length header used on the qemu socket interface. tap_ip_send() is currently the only user of this, but this is used only for "slow path" ICMP and DHCP packets, so there's not a lot of value to the optimization.Not anymore, definitely. It used to look quite bad in perf(1) when I shuffled connection data around. -- Stefano
On Tue, Oct 18, 2022 at 05:05:13AM +0200, Stefano Brivio wrote:On Mon, 17 Oct 2022 19:58:01 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Yeah. In any case I think I can get an equivalent optimization back at some point. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibsonCallers of tap_send() can optionally use a small optimization by adding extra space for the 4 byte length header used on the qemu socket interface. tap_ip_send() is currently the only user of this, but this is used only for "slow path" ICMP and DHCP packets, so there's not a lot of value to the optimization.Not anymore, definitely. It used to look quite bad in perf(1) when I shuffled connection data around.
The IPv4 and IPv6 paths in tap_ip_send() have very little in common, and it turns out that every caller (statically) knows if it is using IPv4 or IPv6. So split into separate tap_ip4_send() and tap_ip6_send() functions. Use a new tap_l2_hdr() function for the very small common part. While we're there, make some minor cleanups: - We were double writing some fields in the IPv6 header, so that it temporary matched the pseudo-header for checksum calculation. With recent checksum reworks, this isn't neccessary any more. - We don't use any IPv4 header options, so use some sizeof() constructs instead of some open coded values for header length. - The comment used to say that the flow label was for TCP over IPv6, but in fact the only thing we used it for was ICMPv6 Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- dhcpv6.c | 6 +- icmp.c | 10 +--- tap.c | 176 +++++++++++++++++++++++++++++-------------------------- tap.h | 6 +- 4 files changed, 102 insertions(+), 96 deletions(-) diff --git a/dhcpv6.c b/dhcpv6.c index e7640ce..7829968 100644 --- a/dhcpv6.c +++ b/dhcpv6.c @@ -531,8 +531,8 @@ int dhcpv6(struct ctx *c, const struct pool *p, resp_not_on_link.hdr.xid = mh->xid; - tap_ip_send(c, src, IPPROTO_UDP, - (char *)&resp_not_on_link, n, mh->xid); + tap_ip6_send(c, src, IPPROTO_UDP, + (char *)&resp_not_on_link, n, mh->xid); return 1; } @@ -580,7 +580,7 @@ int dhcpv6(struct ctx *c, const struct pool *p, resp.hdr.xid = mh->xid; - tap_ip_send(c, src, IPPROTO_UDP, (char *)&resp, n, mh->xid); + tap_ip6_send(c, src, IPPROTO_UDP, (char *)&resp, n, mh->xid); c->ip6.addr_seen = c->ip6.addr; return 1; diff --git a/icmp.c b/icmp.c index 21ea2d7..61c2d90 100644 --- a/icmp.c +++ b/icmp.c @@ -69,10 +69,6 @@ static uint8_t icmp_act[IP_VERSIONS][DIV_ROUND_UP(ICMP_NUM_IDS, 8)]; void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events, const struct timespec *now) { - struct in6_addr a6 = { .s6_addr = { 0, 0, 0, 0, - 0, 0, 0, 0, - 0, 0, 0xff, 0xff, - 0, 0, 0, 0 } }; union icmp_epoll_ref *iref = &ref.r.p.icmp; struct sockaddr_storage sr; socklen_t sl = sizeof(sr); @@ -109,7 +105,7 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V6][id].seq = seq; } - tap_ip_send(c, &sr6->sin6_addr, IPPROTO_ICMPV6, buf, n, 0); + tap_ip6_send(c, &sr6->sin6_addr, IPPROTO_ICMPV6, buf, n, 0); } else { struct sockaddr_in *sr4 = (struct sockaddr_in *)&sr; struct icmphdr *ih = (struct icmphdr *)buf; @@ -127,9 +123,7 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V4][id].seq = seq; } - memcpy(&a6.s6_addr[12], &sr4->sin_addr, sizeof(sr4->sin_addr)); - - tap_ip_send(c, &a6, IPPROTO_ICMP, buf, n, 0); + tap_ip4_send(c, sr4->sin_addr.s_addr, IPPROTO_ICMP, buf, n); } } diff --git a/tap.c b/tap.c index ae75fac..45547ac 100644 --- a/tap.c +++ b/tap.c @@ -109,100 +109,110 @@ const struct in6_addr *tap_ip6_daddr(const struct ctx *c, } /** - * tap_ip_send() - Send IP packet, with L2 headers, calculating L3/L4 checksums + * tap_l2_hdr() - Build an L2 header for an inbound packet * @c: Execution context - * @src: IPv6 source address, IPv4-mapped for IPv4 sources - * @proto: L4 protocol number - * @in: Payload - * @len: L4 payload length - * @flow: Flow label for TCP over IPv6 + * @buf: Buffer address at which to generate header + * @proto: Ethernet protocol number for L3 + * + * Returns a pointer at which to write the packet's payload */ -void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, - const char *in, size_t len, uint32_t flow) +static void *tap_l2_hdr(const struct ctx *c, void *buf, uint16_t proto) { - char buf[USHRT_MAX]; - struct ethhdr *eh; - - eh = (struct ethhdr *)buf; + struct ethhdr *eh = (struct ethhdr *)buf; /* TODO: ARP table lookup */ memcpy(eh->h_dest, c->mac_guest, ETH_ALEN); memcpy(eh->h_source, c->mac, ETH_ALEN); + eh->h_proto = ntohs(proto); + return eh + 1; +} - if (IN6_IS_ADDR_V4MAPPED(src)) { - struct iphdr *iph = (struct iphdr *)(eh + 1); - char *data = (char *)(iph + 1); - - eh->h_proto = ntohs(ETH_P_IP); - - iph->version = 4; - iph->ihl = 5; - iph->tos = 0; - iph->tot_len = htons(len + 20); - iph->id = 0; - iph->frag_off = 0; - iph->ttl = 255; - iph->protocol = proto; - iph->daddr = tap_ip4_daddr(c); - memcpy(&iph->saddr, &src->s6_addr[12], 4); - - csum_ip4_header(iph); - - memcpy(data, in, len); - - if (iph->protocol == IPPROTO_UDP) { - struct udphdr *uh = (struct udphdr *)(iph + 1); - - csum_udp4(uh, iph->saddr, iph->daddr, uh + 1, len - sizeof(*uh)); - } else if (iph->protocol == IPPROTO_ICMP) { - struct icmphdr *ih = (struct icmphdr *)(iph + 1); - csum_icmp4(ih, ih + 1, len - sizeof(*ih)); - } - - if (tap_send(c, buf, len + sizeof(*iph) + sizeof(*eh)) < 0) - debug("tap: failed to send %lu bytes (IPv4)", len); - } else { - struct ipv6hdr *ip6h = (struct ipv6hdr *)(eh + 1); - char *data = (char *)(ip6h + 1); - - eh->h_proto = ntohs(ETH_P_IPV6); - - memset(ip6h->flow_lbl, 0, 3); - ip6h->payload_len = htons(len); - ip6h->priority = 0; - - ip6h->saddr = *src; - ip6h->daddr = *tap_ip6_daddr(c, src); - - memcpy(data, in, len); - - ip6h->hop_limit = proto; - ip6h->version = 0; - ip6h->nexthdr = 0; - - if (proto == IPPROTO_UDP) { - struct udphdr *uh = (struct udphdr *)(ip6h + 1); - - csum_udp6(uh, &ip6h->saddr, &ip6h->daddr, - uh + 1, len - sizeof(*uh)); - } else if (proto == IPPROTO_ICMPV6) { - struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1); +/** + * tap_ip4_send() - Send IPv4 packet, with L2 headers, calculating L3/L4 checksums + * @c: Execution context + * @src: IPv4 source address + * @proto: L4 protocol number + * @in: Payload + * @len: L4 payload length + */ +void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, + const char *in, size_t len) +{ + char buf[USHRT_MAX]; + struct iphdr *ip4h = (struct iphdr *)tap_l2_hdr(c, buf, ETH_P_IP); + char *data = (char *)(ip4h + 1); + + ip4h->version = 4; + ip4h->ihl = sizeof(struct iphdr) / 4; + ip4h->tos = 0; + ip4h->tot_len = htons(len + sizeof(*ip4h)); + ip4h->id = 0; + ip4h->frag_off = 0; + ip4h->ttl = 255; + ip4h->protocol = proto; + ip4h->saddr = src; + ip4h->daddr = tap_ip4_daddr(c); + csum_ip4_header(ip4h); + + memcpy(data, in, len); + + if (ip4h->protocol == IPPROTO_UDP) { + struct udphdr *uh = (struct udphdr *)(ip4h + 1); + + csum_udp4(uh, ip4h->saddr, ip4h->daddr, + uh + 1, len - sizeof(*uh)); + } else if (ip4h->protocol == IPPROTO_ICMP) { + struct icmphdr *ih = (struct icmphdr *)(ip4h + 1); + csum_icmp4(ih, ih + 1, len - sizeof(*ih)); + } - csum_icmp6(ih, &ip6h->saddr, &ip6h->daddr, - ih + 1, len - sizeof(*ih)); - } - ip6h->version = 6; - ip6h->nexthdr = proto; - ip6h->hop_limit = 255; - if (flow) { - ip6h->flow_lbl[0] = (flow >> 16) & 0xf; - ip6h->flow_lbl[1] = (flow >> 8) & 0xff; - ip6h->flow_lbl[2] = (flow >> 0) & 0xff; - } + if (tap_send(c, buf, len + (data - buf)) < 0) + debug("tap: failed to send %lu bytes (IPv4)", len); +} - if (tap_send(c, buf, len + sizeof(*ip6h) + sizeof(*eh)) < 1) - debug("tap: failed to send %lu bytes (IPv6)", len); +/** + * tap_ip6_send() - Send IPv6 packet, with L2 headers, calculating L3/L4 checksums + * @c: Execution context + * @src: IPv6 source address + * @proto: L4 protocol number + * @in: Payload + * @len: L4 payload length + * @flow: Flow label + */ +void tap_ip6_send(const struct ctx *c, const struct in6_addr *src, + uint8_t proto, const char *in, size_t len, uint32_t flow) +{ + char buf[USHRT_MAX]; + struct ipv6hdr *ip6h = (struct ipv6hdr *)tap_l2_hdr(c, buf, ETH_P_IPV6); + char *data = (char *)(ip6h + 1); + + ip6h->payload_len = htons(len); + ip6h->priority = 0; + ip6h->version = 6; + ip6h->nexthdr = proto; + ip6h->hop_limit = 255; + ip6h->saddr = *src; + ip6h->daddr = *tap_ip6_daddr(c, src); + ip6h->flow_lbl[0] = (flow >> 16) & 0xf; + ip6h->flow_lbl[1] = (flow >> 8) & 0xff; + ip6h->flow_lbl[2] = (flow >> 0) & 0xff; + + memcpy(data, in, len); + + if (proto == IPPROTO_UDP) { + struct udphdr *uh = (struct udphdr *)(ip6h + 1); + + csum_udp6(uh, &ip6h->saddr, &ip6h->daddr, + uh + 1, len - sizeof(*uh)); + } else if (proto == IPPROTO_ICMPV6) { + struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1); + + csum_icmp6(ih, &ip6h->saddr, &ip6h->daddr, + ih + 1, len - sizeof(*ih)); } + + if (tap_send(c, buf, len + (data - buf)) < 1) + debug("tap: failed to send %lu bytes (IPv6)", len); } PACKET_POOL_DECL(pool_l4, UIO_MAXIOV, pkt_buf); diff --git a/tap.h b/tap.h index a8da8bb..011ba8e 100644 --- a/tap.h +++ b/tap.h @@ -9,8 +9,10 @@ in_addr_t tap_ip4_daddr(const struct ctx *c); const struct in6_addr *tap_ip6_daddr(const struct ctx *c, const struct in6_addr *src); -void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto, - const char *in, size_t len, uint32_t flow); +void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, + const char *in, size_t len); +void tap_ip6_send(const struct ctx *c, const struct in6_addr *src, + uint8_t proto, const char *in, size_t len, uint32_t flow); int tap_send(const struct ctx *c, const void *data, size_t len); void tap_handler(struct ctx *c, int fd, uint32_t events, const struct timespec *now); -- 2.37.3
On Mon, 17 Oct 2022 19:58:02 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:The IPv4 and IPv6 paths in tap_ip_send() have very little in common, and it turns out that every caller (statically) knows if it is using IPv4 or IPv6. So split into separate tap_ip4_send() and tap_ip6_send() functions. Use a new tap_l2_hdr() function for the very small common part. While we're there, make some minor cleanups: - We were double writing some fields in the IPv6 header, so that it temporary matched the pseudo-header for checksum calculation. With recent checksum reworks, this isn't neccessary any more. - We don't use any IPv4 header options, so use some sizeof() constructs instead of some open coded values for header length. - The comment used to say that the flow label was for TCP over IPv6, but in fact the only thing we used it for was ICMPv6...right, this used to be the data path.Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- dhcpv6.c | 6 +- icmp.c | 10 +--- tap.c | 176 +++++++++++++++++++++++++++++-------------------------- tap.h | 6 +- 4 files changed, 102 insertions(+), 96 deletions(-) diff --git a/dhcpv6.c b/dhcpv6.c index e7640ce..7829968 100644 --- a/dhcpv6.c +++ b/dhcpv6.c @@ -531,8 +531,8 @@ int dhcpv6(struct ctx *c, const struct pool *p, resp_not_on_link.hdr.xid = mh->xid; - tap_ip_send(c, src, IPPROTO_UDP, - (char *)&resp_not_on_link, n, mh->xid); + tap_ip6_send(c, src, IPPROTO_UDP, + (char *)&resp_not_on_link, n, mh->xid); return 1; } @@ -580,7 +580,7 @@ int dhcpv6(struct ctx *c, const struct pool *p, resp.hdr.xid = mh->xid; - tap_ip_send(c, src, IPPROTO_UDP, (char *)&resp, n, mh->xid); + tap_ip6_send(c, src, IPPROTO_UDP, (char *)&resp, n, mh->xid); c->ip6.addr_seen = c->ip6.addr; return 1; diff --git a/icmp.c b/icmp.c index 21ea2d7..61c2d90 100644 --- a/icmp.c +++ b/icmp.c @@ -69,10 +69,6 @@ static uint8_t icmp_act[IP_VERSIONS][DIV_ROUND_UP(ICMP_NUM_IDS, 8)]; void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events, const struct timespec *now) { - struct in6_addr a6 = { .s6_addr = { 0, 0, 0, 0, - 0, 0, 0, 0, - 0, 0, 0xff, 0xff, - 0, 0, 0, 0 } }; union icmp_epoll_ref *iref = &ref.r.p.icmp; struct sockaddr_storage sr; socklen_t sl = sizeof(sr); @@ -109,7 +105,7 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V6][id].seq = seq; } - tap_ip_send(c, &sr6->sin6_addr, IPPROTO_ICMPV6, buf, n, 0); + tap_ip6_send(c, &sr6->sin6_addr, IPPROTO_ICMPV6, buf, n, 0); } else { struct sockaddr_in *sr4 = (struct sockaddr_in *)&sr; struct icmphdr *ih = (struct icmphdr *)buf; @@ -127,9 +123,7 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V4][id].seq = seq; } - memcpy(&a6.s6_addr[12], &sr4->sin_addr, sizeof(sr4->sin_addr)); - - tap_ip_send(c, &a6, IPPROTO_ICMP, buf, n, 0); + tap_ip4_send(c, sr4->sin_addr.s_addr, IPPROTO_ICMP, buf, n); } } diff --git a/tap.c b/tap.c index ae75fac..45547ac 100644 --- a/tap.c +++ b/tap.c @@ -109,100 +109,110 @@ const struct in6_addr *tap_ip6_daddr(const struct ctx *c, } /** - * tap_ip_send() - Send IP packet, with L2 headers, calculating L3/L4 checksums + * tap_l2_hdr() - Build an L2 header for an inbound packet * @c: Execution context - * @src: IPv6 source address, IPv4-mapped for IPv4 sources - * @proto: L4 protocol number - * @in: Payload - * @len: L4 payload length - * @flow: Flow label for TCP over IPv6 + * @buf: Buffer address at which to generate header + * @proto: Ethernet protocol number for L3 + * + * Returns a pointer at which to write the packet's payload* Return: ... -- Stefano
On Tue, Oct 18, 2022 at 05:06:11AM +0200, Stefano Brivio wrote:On Mon, 17 Oct 2022 19:58:02 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Makes sense. Realized that comment isn't quite correct, because it was DHCPv6 rather than ICMPv6 traffic that got the flow labels.The IPv4 and IPv6 paths in tap_ip_send() have very little in common, and it turns out that every caller (statically) knows if it is using IPv4 or IPv6. So split into separate tap_ip4_send() and tap_ip6_send() functions. Use a new tap_l2_hdr() function for the very small common part. While we're there, make some minor cleanups: - We were double writing some fields in the IPv6 header, so that it temporary matched the pseudo-header for checksum calculation. With recent checksum reworks, this isn't neccessary any more. - We don't use any IPv4 header options, so use some sizeof() constructs instead of some open coded values for header length. - The comment used to say that the flow label was for TCP over IPv6, but in fact the only thing we used it for was ICMPv6...right, this used to be the data path.Done. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibsonSigned-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- dhcpv6.c | 6 +- icmp.c | 10 +--- tap.c | 176 +++++++++++++++++++++++++++++-------------------------- tap.h | 6 +- 4 files changed, 102 insertions(+), 96 deletions(-) diff --git a/dhcpv6.c b/dhcpv6.c index e7640ce..7829968 100644 --- a/dhcpv6.c +++ b/dhcpv6.c @@ -531,8 +531,8 @@ int dhcpv6(struct ctx *c, const struct pool *p, resp_not_on_link.hdr.xid = mh->xid; - tap_ip_send(c, src, IPPROTO_UDP, - (char *)&resp_not_on_link, n, mh->xid); + tap_ip6_send(c, src, IPPROTO_UDP, + (char *)&resp_not_on_link, n, mh->xid); return 1; } @@ -580,7 +580,7 @@ int dhcpv6(struct ctx *c, const struct pool *p, resp.hdr.xid = mh->xid; - tap_ip_send(c, src, IPPROTO_UDP, (char *)&resp, n, mh->xid); + tap_ip6_send(c, src, IPPROTO_UDP, (char *)&resp, n, mh->xid); c->ip6.addr_seen = c->ip6.addr; return 1; diff --git a/icmp.c b/icmp.c index 21ea2d7..61c2d90 100644 --- a/icmp.c +++ b/icmp.c @@ -69,10 +69,6 @@ static uint8_t icmp_act[IP_VERSIONS][DIV_ROUND_UP(ICMP_NUM_IDS, 8)]; void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, uint32_t events, const struct timespec *now) { - struct in6_addr a6 = { .s6_addr = { 0, 0, 0, 0, - 0, 0, 0, 0, - 0, 0, 0xff, 0xff, - 0, 0, 0, 0 } }; union icmp_epoll_ref *iref = &ref.r.p.icmp; struct sockaddr_storage sr; socklen_t sl = sizeof(sr); @@ -109,7 +105,7 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V6][id].seq = seq; } - tap_ip_send(c, &sr6->sin6_addr, IPPROTO_ICMPV6, buf, n, 0); + tap_ip6_send(c, &sr6->sin6_addr, IPPROTO_ICMPV6, buf, n, 0); } else { struct sockaddr_in *sr4 = (struct sockaddr_in *)&sr; struct icmphdr *ih = (struct icmphdr *)buf; @@ -127,9 +123,7 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V4][id].seq = seq; } - memcpy(&a6.s6_addr[12], &sr4->sin_addr, sizeof(sr4->sin_addr)); - - tap_ip_send(c, &a6, IPPROTO_ICMP, buf, n, 0); + tap_ip4_send(c, sr4->sin_addr.s_addr, IPPROTO_ICMP, buf, n); } } diff --git a/tap.c b/tap.c index ae75fac..45547ac 100644 --- a/tap.c +++ b/tap.c @@ -109,100 +109,110 @@ const struct in6_addr *tap_ip6_daddr(const struct ctx *c, } /** - * tap_ip_send() - Send IP packet, with L2 headers, calculating L3/L4 checksums + * tap_l2_hdr() - Build an L2 header for an inbound packet * @c: Execution context - * @src: IPv6 source address, IPv4-mapped for IPv4 sources - * @proto: L4 protocol number - * @in: Payload - * @len: L4 payload length - * @flow: Flow label for TCP over IPv6 + * @buf: Buffer address at which to generate header + * @proto: Ethernet protocol number for L3 + * + * Returns a pointer at which to write the packet's payload* Return: ...
tap_ip6_send() has special case logic to compute the checksums for UDP and ICMP packets, which is a mild layering violation. By using a suitable helper we can split it into tap_udp6_send() and tap_icmp6_send() functions without greatly increasing the code size, this removing that layering violation. We make some small changes to the interface while there. In both cases we make the destination IPv6 address a parameter, which will be useful later. For the UDP variant we make it take just the UDP payload, and it will generate the UDP header. For the ICMP variant we pass in the ICMP header as before. The inconsistency is because that's what seems to be the more natural way to invoke the function in the callers in each case. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- dhcpv6.c | 21 +++------------ icmp.c | 3 ++- tap.c | 79 +++++++++++++++++++++++++++++++++++++++----------------- tap.h | 9 +++++-- 4 files changed, 68 insertions(+), 44 deletions(-) diff --git a/dhcpv6.c b/dhcpv6.c index 7829968..e763aed 100644 --- a/dhcpv6.c +++ b/dhcpv6.c @@ -208,15 +208,8 @@ struct msg_hdr { uint32_t xid:24; } __attribute__((__packed__)); -#if __BYTE_ORDER == __BIG_ENDIAN -#define UH_RESP {{{ 547, 546, 0, 0, }}} -#else -#define UH_RESP {{{ __bswap_constant_16(547), __bswap_constant_16(546), 0, 0 }}} -#endif - /** * struct resp_t - Normal advertise and reply message - * @uh: UDP header * @hdr: DHCP message header * @server_id: Server Identifier option * @ia_na: Non-temporary Address option @@ -226,7 +219,6 @@ struct msg_hdr { * @dns_search: Domain Search List, here just for storage size */ static struct resp_t { - struct udphdr uh; struct msg_hdr hdr; struct opt_server_id server_id; @@ -236,7 +228,6 @@ static struct resp_t { struct opt_dns_servers dns_servers; struct opt_dns_search dns_search; } __attribute__((__packed__)) resp = { - UH_RESP, { 0 }, SERVER_ID, @@ -270,13 +261,11 @@ static const struct opt_status_code sc_not_on_link = { /** * struct resp_not_on_link_t - NotOnLink error (mandated by RFC 8415, 18.3.2.) - * @uh: UDP header * @hdr: DHCP message header * @server_id: Server Identifier option * @var: Payload: IA_NA from client, status code, client ID */ static struct resp_not_on_link_t { - struct udphdr uh; struct msg_hdr hdr; struct opt_server_id server_id; @@ -284,7 +273,6 @@ static struct resp_not_on_link_t { uint8_t var[sizeof(struct opt_ia_na) + sizeof(struct opt_status_code) + sizeof(struct opt_client_id)]; } __attribute__((__packed__)) resp_not_on_link = { - UH_RESP, { TYPE_REPLY, 0 }, SERVER_ID, { 0, }, @@ -527,12 +515,11 @@ int dhcpv6(struct ctx *c, const struct pool *p, n += sizeof(struct opt_hdr) + ntohs(client_id->l); n = offsetof(struct resp_not_on_link_t, var) + n; - resp_not_on_link.uh.len = htons(n); resp_not_on_link.hdr.xid = mh->xid; - tap_ip6_send(c, src, IPPROTO_UDP, - (char *)&resp_not_on_link, n, mh->xid); + tap_udp6_send(c, src, 547, tap_ip6_daddr(c, src), 546, + mh->xid, &resp_not_on_link, n); return 1; } @@ -576,11 +563,11 @@ int dhcpv6(struct ctx *c, const struct pool *p, n = offsetof(struct resp_t, client_id) + sizeof(struct opt_hdr) + ntohs(client_id->l); n = dhcpv6_dns_fill(c, (char *)&resp, n); - resp.uh.len = htons(n); resp.hdr.xid = mh->xid; - tap_ip6_send(c, src, IPPROTO_UDP, (char *)&resp, n, mh->xid); + tap_udp6_send(c, src, 547, tap_ip6_daddr(c, src), 546, + mh->xid, &resp, n); c->ip6.addr_seen = c->ip6.addr; return 1; diff --git a/icmp.c b/icmp.c index 61c2d90..6493ea9 100644 --- a/icmp.c +++ b/icmp.c @@ -105,7 +105,8 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V6][id].seq = seq; } - tap_ip6_send(c, &sr6->sin6_addr, IPPROTO_ICMPV6, buf, n, 0); + tap_icmp6_send(c, &sr6->sin6_addr, + tap_ip6_daddr(c, &sr6->sin6_addr), buf, n); } else { struct sockaddr_in *sr4 = (struct sockaddr_in *)&sr; struct icmphdr *ih = (struct icmphdr *)buf; diff --git a/tap.c b/tap.c index 45547ac..b0c1481 100644 --- a/tap.c +++ b/tap.c @@ -170,21 +170,11 @@ void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, debug("tap: failed to send %lu bytes (IPv4)", len); } -/** - * tap_ip6_send() - Send IPv6 packet, with L2 headers, calculating L3/L4 checksums - * @c: Execution context - * @src: IPv6 source address - * @proto: L4 protocol number - * @in: Payload - * @len: L4 payload length - * @flow: Flow label - */ -void tap_ip6_send(const struct ctx *c, const struct in6_addr *src, - uint8_t proto, const char *in, size_t len, uint32_t flow) +static void *tap_ip6_hdr(char *buf, + const struct in6_addr *src, const struct in6_addr *dst, + size_t len, uint8_t proto, uint32_t flow) { - char buf[USHRT_MAX]; - struct ipv6hdr *ip6h = (struct ipv6hdr *)tap_l2_hdr(c, buf, ETH_P_IPV6); - char *data = (char *)(ip6h + 1); + struct ipv6hdr *ip6h = (struct ipv6hdr *)buf; ip6h->payload_len = htons(len); ip6h->priority = 0; @@ -192,24 +182,65 @@ void tap_ip6_send(const struct ctx *c, const struct in6_addr *src, ip6h->nexthdr = proto; ip6h->hop_limit = 255; ip6h->saddr = *src; - ip6h->daddr = *tap_ip6_daddr(c, src); + ip6h->daddr = *dst; ip6h->flow_lbl[0] = (flow >> 16) & 0xf; ip6h->flow_lbl[1] = (flow >> 8) & 0xff; ip6h->flow_lbl[2] = (flow >> 0) & 0xff; + return ip6h + 1; +} +/** + * tap_udp6_send() - Send UDP over IPv6 packet + * @c: Execution context + * @src: IPv6 source address + * @sport: UDP source port + * @dst: IPv6 destination address + * @dport: UDP destination port + * @flow: Flow label + * @in: UDP payload contents (not including UDP header) + * @len: UDP payload length (not including UDP header) + */ +void tap_udp6_send(const struct ctx *c, + const struct in6_addr *src, in_port_t sport, + const struct in6_addr *dst, in_port_t dport, + uint32_t flow, const void *in, size_t len) +{ + size_t udplen = len + sizeof(struct udphdr); + char buf[USHRT_MAX]; + void *ip6h = tap_l2_hdr(c, buf, ETH_P_IPV6); + void *uhp = tap_ip6_hdr(ip6h, src, dst, udplen, IPPROTO_UDP, flow); + struct udphdr *uh = (struct udphdr *)uhp; + char *data = (char *)(uh + 1); + + uh->source = htons(sport); + uh->dest = htons(dport); + uh->len = htons(udplen); + csum_udp6(uh, src, dst, in, len); memcpy(data, in, len); - if (proto == IPPROTO_UDP) { - struct udphdr *uh = (struct udphdr *)(ip6h + 1); + if (tap_send(c, buf, len + (data - buf)) < 1) + debug("tap: failed to send %lu bytes (IPv6)", len); +} - csum_udp6(uh, &ip6h->saddr, &ip6h->daddr, - uh + 1, len - sizeof(*uh)); - } else if (proto == IPPROTO_ICMPV6) { - struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1); +/** + * tap_icmp6_send() - Send ICMPv6 packet + * @c: Execution context + * @src: IPv6 source address + * @dst: IPv6 destination address + * @in: ICMP packet, including ICMP header + * @len: ICMP packet length, including ICMP header + */ +void tap_icmp6_send(const struct ctx *c, + const struct in6_addr *src, const struct in6_addr *dst, + void *in, size_t len) +{ + char buf[USHRT_MAX]; + void *ip6h = tap_l2_hdr(c, buf, ETH_P_IPV6); + char *data = tap_ip6_hdr(ip6h, src, dst, len, IPPROTO_ICMPV6, 0); + struct icmp6hdr *icmp6h = (struct icmp6hdr *)data; - csum_icmp6(ih, &ip6h->saddr, &ip6h->daddr, - ih + 1, len - sizeof(*ih)); - } + memcpy(data, in, len); + csum_icmp6(icmp6h, src, dst, icmp6h + 1, len - sizeof(*icmp6h)); if (tap_send(c, buf, len + (data - buf)) < 1) debug("tap: failed to send %lu bytes (IPv6)", len); diff --git a/tap.h b/tap.h index 011ba8e..d43c7a0 100644 --- a/tap.h +++ b/tap.h @@ -11,8 +11,13 @@ const struct in6_addr *tap_ip6_daddr(const struct ctx *c, const struct in6_addr *src); void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, const char *in, size_t len); -void tap_ip6_send(const struct ctx *c, const struct in6_addr *src, - uint8_t proto, const char *in, size_t len, uint32_t flow); +void tap_udp6_send(const struct ctx *c, + const struct in6_addr *src, in_port_t sport, + const struct in6_addr *dst, in_port_t dport, + uint32_t flow, const void *in, size_t len); +void tap_icmp6_send(const struct ctx *c, + const struct in6_addr *src, const struct in6_addr *dst, + void *in, size_t len); int tap_send(const struct ctx *c, const void *data, size_t len); void tap_handler(struct ctx *c, int fd, uint32_t events, const struct timespec *now); -- 2.37.3
ndp() takes a parameter giving the ethernet source address of the packet it is to respond to, which it uses to determine the destination address to send the reply packet to. This is not necessary, because the address will always be the guest's MAC address. Even if the guest has just changed MAC address, then either tap_handler_passt() or tap_handler_pasta() - which are the only call paths leading to ndp() will have updated c->mac_guest with the new value. So, remove the parameter, and just use c->mac_guest, making it more consistent with other paths where we construct packets to send inwards. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- ndp.c | 6 ++---- ndp.h | 3 +-- tap.c | 2 +- 3 files changed, 4 insertions(+), 7 deletions(-) diff --git a/ndp.c b/ndp.c index 79be0cf..f96b4b7 100644 --- a/ndp.c +++ b/ndp.c @@ -41,13 +41,11 @@ * ndp() - Check for NDP solicitations, reply as needed * @c: Execution context * @ih: ICMPv6 header - * @eh_source: Source Ethernet address * @saddr Source IPv6 address * * Return: 0 if not handled here, 1 if handled, -1 on failure */ -int ndp(struct ctx *c, const struct icmp6hdr *ih, - const unsigned char *eh_source, const struct in6_addr *saddr) +int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr) { char buf[BUFSIZ] = { 0 }; struct ipv6hdr *ip6hr; @@ -196,7 +194,7 @@ dns_done: ip6hr->hop_limit = 255; len += sizeof(*ehr) + sizeof(*ip6hr) + sizeof(*ihr); - memcpy(ehr->h_dest, eh_source, ETH_ALEN); + memcpy(ehr->h_dest, c->mac_guest, ETH_ALEN); memcpy(ehr->h_source, c->mac, ETH_ALEN); ehr->h_proto = htons(ETH_P_IPV6); diff --git a/ndp.h b/ndp.h index d857425..b012747 100644 --- a/ndp.h +++ b/ndp.h @@ -6,7 +6,6 @@ #ifndef NDP_H #define NDP_H -int ndp(struct ctx *c, const struct icmp6hdr *ih, - const unsigned char *eh_source, const struct in6_addr *saddr); +int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr); #endif /* NDP_H */ diff --git a/tap.c b/tap.c index b0c1481..274f4ba 100644 --- a/tap.c +++ b/tap.c @@ -560,7 +560,7 @@ resume: if (l4_len < sizeof(struct icmp6hdr)) continue; - if (ndp(c, (struct icmp6hdr *)l4h, eh->h_source, saddr)) + if (ndp(c, (struct icmp6hdr *)l4h, saddr)) continue; tap_packet_debug(NULL, ip6h, NULL, proto, NULL, 1); -- 2.37.3
We send ICMPv6 packets to the guest from both icmp.c and from ndp.c. The case in ndp() manually constructs L2 and IPv6 headers, unlike the version in icmp.c which uses the tap_icmp6_send() helper from tap.c Now that we've broaded the parameters of tap_icmp6_send() we can use it in ndp() as well saving some duplicated logic. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- ndp.c | 21 ++++----------------- 1 file changed, 4 insertions(+), 17 deletions(-) diff --git a/ndp.c b/ndp.c index f96b4b7..80e1f19 100644 --- a/ndp.c +++ b/ndp.c @@ -47,6 +47,7 @@ */ int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr) { + const struct in6_addr *rsaddr; /* src addr for reply */ char buf[BUFSIZ] = { 0 }; struct ipv6hdr *ip6hr; struct icmp6hdr *ihr; @@ -180,26 +181,12 @@ dns_done: else c->ip6.addr_seen = *saddr; - ip6hr->daddr = *saddr; if (IN6_IS_ADDR_LINKLOCAL(&c->ip6.gw)) - ip6hr->saddr = c->ip6.gw; + rsaddr = &c->ip6.gw; else - ip6hr->saddr = c->ip6.addr_ll; + rsaddr = &c->ip6.addr_ll; - ip6hr->payload_len = htons(sizeof(*ihr) + len); - csum_icmp6(ihr, &ip6hr->saddr, &ip6hr->daddr, ihr + 1, len); - - ip6hr->version = 6; - ip6hr->nexthdr = IPPROTO_ICMPV6; - ip6hr->hop_limit = 255; - - len += sizeof(*ehr) + sizeof(*ip6hr) + sizeof(*ihr); - memcpy(ehr->h_dest, c->mac_guest, ETH_ALEN); - memcpy(ehr->h_source, c->mac, ETH_ALEN); - ehr->h_proto = htons(ETH_P_IPV6); - - if (tap_send(c, ehr, len) < 0) - perror("NDP: send"); + tap_icmp6_send(c, rsaddr, saddr, ihr, len + sizeof(*ihr)); return 1; } -- 2.37.3
tap_ip4_send() has special case logic to compute the checksums for UDP and ICMP packets, which is a mild layering violation. By using a suitable helper we can split it into tap_udp4_send() and tap_icmp4_send() functions without greatly increasing the code size, this removing that layering violation. We make some small changes to the interface while there. In both cases we make the destination IPv4 address a parameter, which will be useful later. For the UDP variant we make it take just the UDP payload, and it will generate the UDP header. For the ICMP variant we pass in the ICMP header as before. The inconsistency is because that's what seems to be the more natural way to invoke the function in the callers in each case. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- icmp.c | 3 ++- tap.c | 75 +++++++++++++++++++++++++++++++++++++++++----------------- tap.h | 7 ++++-- 3 files changed, 60 insertions(+), 25 deletions(-) diff --git a/icmp.c b/icmp.c index 6493ea9..233acf9 100644 --- a/icmp.c +++ b/icmp.c @@ -124,7 +124,8 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V4][id].seq = seq; } - tap_ip4_send(c, sr4->sin_addr.s_addr, IPPROTO_ICMP, buf, n); + tap_icmp4_send(c, sr4->sin_addr.s_addr, tap_ip4_daddr(c), + buf, n); } } diff --git a/tap.c b/tap.c index 274f4ba..5792880 100644 --- a/tap.c +++ b/tap.c @@ -127,20 +127,10 @@ static void *tap_l2_hdr(const struct ctx *c, void *buf, uint16_t proto) return eh + 1; } -/** - * tap_ip4_send() - Send IPv4 packet, with L2 headers, calculating L3/L4 checksums - * @c: Execution context - * @src: IPv4 source address - * @proto: L4 protocol number - * @in: Payload - * @len: L4 payload length - */ -void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, - const char *in, size_t len) +static void *tap_ip4_hdr(char *buf, in_addr_t src, in_addr_t dst, + size_t len, uint8_t proto) { - char buf[USHRT_MAX]; - struct iphdr *ip4h = (struct iphdr *)tap_l2_hdr(c, buf, ETH_P_IP); - char *data = (char *)(ip4h + 1); + struct iphdr *ip4h = (struct iphdr *)buf; ip4h->version = 4; ip4h->ihl = sizeof(struct iphdr) / 4; @@ -151,20 +141,61 @@ void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, ip4h->ttl = 255; ip4h->protocol = proto; ip4h->saddr = src; - ip4h->daddr = tap_ip4_daddr(c); + ip4h->daddr = dst; csum_ip4_header(ip4h); + return ip4h + 1; +} + +/** + * tap_udp4_send() - Send UDP over IPv4 packet + * @c: Execution context + * @src: IPv4 source address + * @sport: UDP source port + * @dst: IPv4 destination address + * @dport: UDP destination port + * @in: UDP payload contents (not including UDP header) + * @len: UDP payload length (not including UDP header) + */ +/* cppcheck-suppress unusedFunction */ +void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport, + in_addr_t dst, in_port_t dport, + const void *in, size_t len) +{ + size_t udplen = len + sizeof(struct udphdr); + char buf[USHRT_MAX]; + void *ip4h = tap_l2_hdr(c, buf, ETH_P_IP); + void *uhp = tap_ip4_hdr(ip4h, src, dst, udplen, IPPROTO_UDP); + struct udphdr *uh = (struct udphdr *)uhp; + char *data = (char *)(uh + 1); + uh->source = htons(sport); + uh->dest = htons(dport); + uh->len = htons(udplen); + csum_udp4(uh, src, dst, in, len); memcpy(data, in, len); - if (ip4h->protocol == IPPROTO_UDP) { - struct udphdr *uh = (struct udphdr *)(ip4h + 1); + if (tap_send(c, buf, len + (data - buf)) < 0) + debug("tap: failed to send %lu bytes (IPv4)", len); +} - csum_udp4(uh, ip4h->saddr, ip4h->daddr, - uh + 1, len - sizeof(*uh)); - } else if (ip4h->protocol == IPPROTO_ICMP) { - struct icmphdr *ih = (struct icmphdr *)(ip4h + 1); - csum_icmp4(ih, ih + 1, len - sizeof(*ih)); - } +/** + * tap_icmp4_send() - Send ICMPv4 packet + * @c: Execution context + * @src: IPv4 source address + * @dst: IPv4 destination address + * @in: ICMP packet, including ICMP header + * @len: ICMP packet length, including ICMP header + */ +void tap_icmp4_send(const struct ctx *c, in_addr_t src, in_addr_t dst, + void *in, size_t len) +{ + char buf[USHRT_MAX]; + void *ip4h = tap_l2_hdr(c, buf, ETH_P_IP); + char *data = tap_ip4_hdr(ip4h, src, dst, len, IPPROTO_ICMP); + struct icmphdr *icmp4h = (struct icmphdr *)data; + + memcpy(data, in, len); + csum_icmp4(icmp4h, icmp4h + 1, len - sizeof(*icmp4h)); if (tap_send(c, buf, len + (data - buf)) < 0) debug("tap: failed to send %lu bytes (IPv4)", len); diff --git a/tap.h b/tap.h index d43c7a0..743bc58 100644 --- a/tap.h +++ b/tap.h @@ -7,10 +7,13 @@ #define TAP_H in_addr_t tap_ip4_daddr(const struct ctx *c); +void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport, + in_addr_t dst, in_port_t dport, + const void *in, size_t len); +void tap_icmp4_send(const struct ctx *c, in_addr_t src, in_addr_t dst, + void *in, size_t len); const struct in6_addr *tap_ip6_daddr(const struct ctx *c, const struct in6_addr *src); -void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, - const char *in, size_t len); void tap_udp6_send(const struct ctx *c, const struct in6_addr *src, in_port_t sport, const struct in6_addr *dst, in_port_t dport, -- 2.37.3
On Mon, 17 Oct 2022 19:58:06 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:tap_ip4_send() has special case logic to compute the checksums for UDP and ICMP packets, which is a mild layering violation. By using a suitable helper we can split it into tap_udp4_send() and tap_icmp4_send() functions without greatly increasing the code size, this removing that layering violation. We make some small changes to the interface while there. In both cases we make the destination IPv4 address a parameter, which will be useful later. For the UDP variant we make it take just the UDP payload, and it will generate the UDP header. For the ICMP variant we pass in the ICMP header as before. The inconsistency is because that's what seems to be the more natural way to invoke the function in the callers in each case. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- icmp.c | 3 ++- tap.c | 75 +++++++++++++++++++++++++++++++++++++++++----------------- tap.h | 7 ++++-- 3 files changed, 60 insertions(+), 25 deletions(-) diff --git a/icmp.c b/icmp.c index 6493ea9..233acf9 100644 --- a/icmp.c +++ b/icmp.c @@ -124,7 +124,8 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V4][id].seq = seq; } - tap_ip4_send(c, sr4->sin_addr.s_addr, IPPROTO_ICMP, buf, n); + tap_icmp4_send(c, sr4->sin_addr.s_addr, tap_ip4_daddr(c), + buf, n); } } diff --git a/tap.c b/tap.c index 274f4ba..5792880 100644 --- a/tap.c +++ b/tap.c @@ -127,20 +127,10 @@ static void *tap_l2_hdr(const struct ctx *c, void *buf, uint16_t proto) return eh + 1; } -/** - * tap_ip4_send() - Send IPv4 packet, with L2 headers, calculating L3/L4 checksums - * @c: Execution context - * @src: IPv4 source address - * @proto: L4 protocol number - * @in: Payload - * @len: L4 payload length - */ -void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, - const char *in, size_t len)I understand why you return ip(4)h + 1 here because I've just reviewed 9/14, I wouldn't know otherwise: /** * tap_ip4_hdr() - Build IPv4 header for inbound packet, with checksum * @c: Execution context * @src: IPv4 source address, network order * @dst: IPv4 destination address, network order * @len: L4 payload length * @proto: L4 protocol number * * Return: pointer to write payload to */+static void *tap_ip4_hdr(char *buf, in_addr_t src, in_addr_t dst, + size_t len, uint8_t proto) { - char buf[USHRT_MAX]; - struct iphdr *ip4h = (struct iphdr *)tap_l2_hdr(c, buf, ETH_P_IP); - char *data = (char *)(ip4h + 1); + struct iphdr *ip4h = (struct iphdr *)buf; ip4h->version = 4; ip4h->ihl = sizeof(struct iphdr) / 4; @@ -151,20 +141,61 @@ void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, ip4h->ttl = 255; ip4h->protocol = proto; ip4h->saddr = src; - ip4h->daddr = tap_ip4_daddr(c); + ip4h->daddr = dst; csum_ip4_header(ip4h); + return ip4h + 1; +} + +/** + * tap_udp4_send() - Send UDP over IPv4 packet + * @c: Execution context + * @src: IPv4 source address + * @sport: UDP source port + * @dst: IPv4 destination address + * @dport: UDP destination port + * @in: UDP payload contents (not including UDP header) + * @len: UDP payload length (not including UDP header) + */ +/* cppcheck-suppress unusedFunction */ +void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport, + in_addr_t dst, in_port_t dport, + const void *in, size_t len) +{ + size_t udplen = len + sizeof(struct udphdr); + char buf[USHRT_MAX]; + void *ip4h = tap_l2_hdr(c, buf, ETH_P_IP); + void *uhp = tap_ip4_hdr(ip4h, src, dst, udplen, IPPROTO_UDP);Two observations: - this saves one line and one cast, but it's really a bit unnatural that tap_ip4_hdr() doesn't point to the header it just made, or to nothing. I would rather have to +1 the return value or the original pointer instead or having this trick+ struct udphdr *uh = (struct udphdr *)uhp; + char *data = (char *)(uh + 1);- it's longer, but in my opinion clearer, if we split a bit more clearly the components of the packet, that is, something like (untested): char buf[USHRT_MAX]; struct udphdr *uh; struct iphdr *iph; char *data; iph = (struct iphdr *)tap_l2_hdr(c, buf, ETH_P_IP) + 1; tap_ip_hdr(iph, src, dst, len + sizeof(uh), IPPROTO_UDP); uh = (struct udphdr *)iph + 1; uh->source = htons(sport); uh->dest = htons(dport); uh->len = htons(len + sizeof(uh)); csum_udp4(uh, src, dst, in, len); data = uh + 1; memcpy(data, in, len); if (tap_send(c, buf, len + (data - buf)) < 0) debug("tap: failed to send %lu bytes (IPv4)", len);+ uh->source = htons(sport); + uh->dest = htons(dport); + uh->len = htons(udplen); + csum_udp4(uh, src, dst, in, len); memcpy(data, in, len); - if (ip4h->protocol == IPPROTO_UDP) { - struct udphdr *uh = (struct udphdr *)(ip4h + 1); + if (tap_send(c, buf, len + (data - buf)) < 0) + debug("tap: failed to send %lu bytes (IPv4)", len); +} - csum_udp4(uh, ip4h->saddr, ip4h->daddr, - uh + 1, len - sizeof(*uh)); - } else if (ip4h->protocol == IPPROTO_ICMP) { - struct icmphdr *ih = (struct icmphdr *)(ip4h + 1); - csum_icmp4(ih, ih + 1, len - sizeof(*ih)); - } +/** + * tap_icmp4_send() - Send ICMPv4 packet + * @c: Execution context + * @src: IPv4 source address + * @dst: IPv4 destination address + * @in: ICMP packet, including ICMP header + * @len: ICMP packet length, including ICMP header + */ +void tap_icmp4_send(const struct ctx *c, in_addr_t src, in_addr_t dst, + void *in, size_t len) +{ + char buf[USHRT_MAX]; + void *ip4h = tap_l2_hdr(c, buf, ETH_P_IP); + char *data = tap_ip4_hdr(ip4h, src, dst, len, IPPROTO_ICMP); + struct icmphdr *icmp4h = (struct icmphdr *)data;...same here, even though perhaps not so apparent.+ + memcpy(data, in, len); + csum_icmp4(icmp4h, icmp4h + 1, len - sizeof(*icmp4h)); if (tap_send(c, buf, len + (data - buf)) < 0) debug("tap: failed to send %lu bytes (IPv4)", len); diff --git a/tap.h b/tap.h index d43c7a0..743bc58 100644 --- a/tap.h +++ b/tap.h @@ -7,10 +7,13 @@ #define TAP_H in_addr_t tap_ip4_daddr(const struct ctx *c); +void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport, + in_addr_t dst, in_port_t dport, + const void *in, size_t len); +void tap_icmp4_send(const struct ctx *c, in_addr_t src, in_addr_t dst, + void *in, size_t len); const struct in6_addr *tap_ip6_daddr(const struct ctx *c, const struct in6_addr *src); -void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, - const char *in, size_t len); void tap_udp6_send(const struct ctx *c, const struct in6_addr *src, in_port_t sport, const struct in6_addr *dst, in_port_t dport,-- Stefano
On Tue, Oct 18, 2022 at 05:06:34AM +0200, Stefano Brivio wrote:On Mon, 17 Oct 2022 19:58:06 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Oops, yes, forgot to add a function comment. Done.tap_ip4_send() has special case logic to compute the checksums for UDP and ICMP packets, which is a mild layering violation. By using a suitable helper we can split it into tap_udp4_send() and tap_icmp4_send() functions without greatly increasing the code size, this removing that layering violation. We make some small changes to the interface while there. In both cases we make the destination IPv4 address a parameter, which will be useful later. For the UDP variant we make it take just the UDP payload, and it will generate the UDP header. For the ICMP variant we pass in the ICMP header as before. The inconsistency is because that's what seems to be the more natural way to invoke the function in the callers in each case. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- icmp.c | 3 ++- tap.c | 75 +++++++++++++++++++++++++++++++++++++++++----------------- tap.h | 7 ++++-- 3 files changed, 60 insertions(+), 25 deletions(-) diff --git a/icmp.c b/icmp.c index 6493ea9..233acf9 100644 --- a/icmp.c +++ b/icmp.c @@ -124,7 +124,8 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V4][id].seq = seq; } - tap_ip4_send(c, sr4->sin_addr.s_addr, IPPROTO_ICMP, buf, n); + tap_icmp4_send(c, sr4->sin_addr.s_addr, tap_ip4_daddr(c), + buf, n); } } diff --git a/tap.c b/tap.c index 274f4ba..5792880 100644 --- a/tap.c +++ b/tap.c @@ -127,20 +127,10 @@ static void *tap_l2_hdr(const struct ctx *c, void *buf, uint16_t proto) return eh + 1; } -/** - * tap_ip4_send() - Send IPv4 packet, with L2 headers, calculating L3/L4 checksums - * @c: Execution context - * @src: IPv4 source address - * @proto: L4 protocol number - * @in: Payload - * @len: L4 payload length - */ -void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, - const char *in, size_t len)I understand why you return ip(4)h + 1 here because I've just reviewed 9/14, I wouldn't know otherwise: /** * tap_ip4_hdr() - Build IPv4 header for inbound packet, with checksum * @c: Execution context * @src: IPv4 source address, network order * @dst: IPv4 destination address, network order * @len: L4 payload length * @proto: L4 protocol number * * Return: pointer to write payload to */I don't really want to change this. Yes, it's a bit counterintuitive at first blush, but there's a reason for this approach. This style of a function which generates a header then points *after* it works even if the header it generates is of variable length. Advancing to the payload in the caller doesn't (at least not without breaking the abstraction I'm trying to generate with these helpers). That's not just theoretical, because at some point I'd like to extend the l2_hdr function to also allocate space for the qemu socket length header. I'm certainly open to name changes to make this behaviour more obvious, but I think returning the payload pointer not the header pointer makes for a better abstraction here.+static void *tap_ip4_hdr(char *buf, in_addr_t src, in_addr_t dst, + size_t len, uint8_t proto) { - char buf[USHRT_MAX]; - struct iphdr *ip4h = (struct iphdr *)tap_l2_hdr(c, buf, ETH_P_IP); - char *data = (char *)(ip4h + 1); + struct iphdr *ip4h = (struct iphdr *)buf; ip4h->version = 4; ip4h->ihl = sizeof(struct iphdr) / 4; @@ -151,20 +141,61 @@ void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, ip4h->ttl = 255; ip4h->protocol = proto; ip4h->saddr = src; - ip4h->daddr = tap_ip4_daddr(c); + ip4h->daddr = dst; csum_ip4_header(ip4h); + return ip4h + 1; +} + +/** + * tap_udp4_send() - Send UDP over IPv4 packet + * @c: Execution context + * @src: IPv4 source address + * @sport: UDP source port + * @dst: IPv4 destination address + * @dport: UDP destination port + * @in: UDP payload contents (not including UDP header) + * @len: UDP payload length (not including UDP header) + */ +/* cppcheck-suppress unusedFunction */ +void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport, + in_addr_t dst, in_port_t dport, + const void *in, size_t len) +{ + size_t udplen = len + sizeof(struct udphdr); + char buf[USHRT_MAX]; + void *ip4h = tap_l2_hdr(c, buf, ETH_P_IP); + void *uhp = tap_ip4_hdr(ip4h, src, dst, udplen, IPPROTO_UDP);Two observations: - this saves one line and one cast, but it's really a bit unnatural that tap_ip4_hdr() doesn't point to the header it just made, or to nothing. I would rather have to +1 the return value or the original pointer instead or having this trick+ struct udphdr *uh = (struct udphdr *)uhp; + char *data = (char *)(uh + 1);- it's longer, but in my opinion clearer, if we split a bit more clearly the components of the packet, that is, something like (untested):char buf[USHRT_MAX]; struct udphdr *uh; struct iphdr *iph; char *data; iph = (struct iphdr *)tap_l2_hdr(c, buf, ETH_P_IP) + 1; tap_ip_hdr(iph, src, dst, len + sizeof(uh), IPPROTO_UDP); uh = (struct udphdr *)iph + 1; uh->source = htons(sport); uh->dest = htons(dport); uh->len = htons(len + sizeof(uh)); csum_udp4(uh, src, dst, in, len); data = uh + 1; memcpy(data, in, len); if (tap_send(c, buf, len + (data - buf)) < 0) debug("tap: failed to send %lu bytes (IPv4)", len);-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson+ uh->source = htons(sport); + uh->dest = htons(dport); + uh->len = htons(udplen); + csum_udp4(uh, src, dst, in, len); memcpy(data, in, len); - if (ip4h->protocol == IPPROTO_UDP) { - struct udphdr *uh = (struct udphdr *)(ip4h + 1); + if (tap_send(c, buf, len + (data - buf)) < 0) + debug("tap: failed to send %lu bytes (IPv4)", len); +} - csum_udp4(uh, ip4h->saddr, ip4h->daddr, - uh + 1, len - sizeof(*uh)); - } else if (ip4h->protocol == IPPROTO_ICMP) { - struct icmphdr *ih = (struct icmphdr *)(ip4h + 1); - csum_icmp4(ih, ih + 1, len - sizeof(*ih)); - } +/** + * tap_icmp4_send() - Send ICMPv4 packet + * @c: Execution context + * @src: IPv4 source address + * @dst: IPv4 destination address + * @in: ICMP packet, including ICMP header + * @len: ICMP packet length, including ICMP header + */ +void tap_icmp4_send(const struct ctx *c, in_addr_t src, in_addr_t dst, + void *in, size_t len) +{ + char buf[USHRT_MAX]; + void *ip4h = tap_l2_hdr(c, buf, ETH_P_IP); + char *data = tap_ip4_hdr(ip4h, src, dst, len, IPPROTO_ICMP); + struct icmphdr *icmp4h = (struct icmphdr *)data;...same here, even though perhaps not so apparent.+ + memcpy(data, in, len); + csum_icmp4(icmp4h, icmp4h + 1, len - sizeof(*icmp4h)); if (tap_send(c, buf, len + (data - buf)) < 0) debug("tap: failed to send %lu bytes (IPv4)", len); diff --git a/tap.h b/tap.h index d43c7a0..743bc58 100644 --- a/tap.h +++ b/tap.h @@ -7,10 +7,13 @@ #define TAP_H in_addr_t tap_ip4_daddr(const struct ctx *c); +void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport, + in_addr_t dst, in_port_t dport, + const void *in, size_t len); +void tap_icmp4_send(const struct ctx *c, in_addr_t src, in_addr_t dst, + void *in, size_t len); const struct in6_addr *tap_ip6_daddr(const struct ctx *c, const struct in6_addr *src); -void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, - const char *in, size_t len); void tap_udp6_send(const struct ctx *c, const struct in6_addr *src, in_port_t sport, const struct in6_addr *dst, in_port_t dport,
On Tue, 18 Oct 2022 23:07:58 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:On Tue, Oct 18, 2022 at 05:06:34AM +0200, Stefano Brivio wrote:Hmm, yes, I think the variable length case is a very valid point, and also in terms of abstraction I see the advantage. There are just two things I can think of: - passing the end pointer as argument (not as practical as your solution, though) - naming it tap_ip4_push_hdr(), tap_ip4_hdr_after(), tap_ip4_hdr_goto_next(), tap_ip4_leave_header_behind()... I can't think of anything better at this point. I'll keep thinking, but at the moment I'd be fine even with the current name. -- StefanoOn Mon, 17 Oct 2022 19:58:06 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Oops, yes, forgot to add a function comment. Done.tap_ip4_send() has special case logic to compute the checksums for UDP and ICMP packets, which is a mild layering violation. By using a suitable helper we can split it into tap_udp4_send() and tap_icmp4_send() functions without greatly increasing the code size, this removing that layering violation. We make some small changes to the interface while there. In both cases we make the destination IPv4 address a parameter, which will be useful later. For the UDP variant we make it take just the UDP payload, and it will generate the UDP header. For the ICMP variant we pass in the ICMP header as before. The inconsistency is because that's what seems to be the more natural way to invoke the function in the callers in each case. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- icmp.c | 3 ++- tap.c | 75 +++++++++++++++++++++++++++++++++++++++++----------------- tap.h | 7 ++++-- 3 files changed, 60 insertions(+), 25 deletions(-) diff --git a/icmp.c b/icmp.c index 6493ea9..233acf9 100644 --- a/icmp.c +++ b/icmp.c @@ -124,7 +124,8 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V4][id].seq = seq; } - tap_ip4_send(c, sr4->sin_addr.s_addr, IPPROTO_ICMP, buf, n); + tap_icmp4_send(c, sr4->sin_addr.s_addr, tap_ip4_daddr(c), + buf, n); } } diff --git a/tap.c b/tap.c index 274f4ba..5792880 100644 --- a/tap.c +++ b/tap.c @@ -127,20 +127,10 @@ static void *tap_l2_hdr(const struct ctx *c, void *buf, uint16_t proto) return eh + 1; } -/** - * tap_ip4_send() - Send IPv4 packet, with L2 headers, calculating L3/L4 checksums - * @c: Execution context - * @src: IPv4 source address - * @proto: L4 protocol number - * @in: Payload - * @len: L4 payload length - */ -void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, - const char *in, size_t len)I understand why you return ip(4)h + 1 here because I've just reviewed 9/14, I wouldn't know otherwise: /** * tap_ip4_hdr() - Build IPv4 header for inbound packet, with checksum * @c: Execution context * @src: IPv4 source address, network order * @dst: IPv4 destination address, network order * @len: L4 payload length * @proto: L4 protocol number * * Return: pointer to write payload to */I don't really want to change this. Yes, it's a bit counterintuitive at first blush, but there's a reason for this approach. This style of a function which generates a header then points *after* it works even if the header it generates is of variable length. Advancing to the payload in the caller doesn't (at least not without breaking the abstraction I'm trying to generate with these helpers). That's not just theoretical, because at some point I'd like to extend the l2_hdr function to also allocate space for the qemu socket length header. I'm certainly open to name changes to make this behaviour more obvious, but I think returning the payload pointer not the header pointer makes for a better abstraction here.+static void *tap_ip4_hdr(char *buf, in_addr_t src, in_addr_t dst, + size_t len, uint8_t proto) { - char buf[USHRT_MAX]; - struct iphdr *ip4h = (struct iphdr *)tap_l2_hdr(c, buf, ETH_P_IP); - char *data = (char *)(ip4h + 1); + struct iphdr *ip4h = (struct iphdr *)buf; ip4h->version = 4; ip4h->ihl = sizeof(struct iphdr) / 4; @@ -151,20 +141,61 @@ void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, ip4h->ttl = 255; ip4h->protocol = proto; ip4h->saddr = src; - ip4h->daddr = tap_ip4_daddr(c); + ip4h->daddr = dst; csum_ip4_header(ip4h); + return ip4h + 1; +} + +/** + * tap_udp4_send() - Send UDP over IPv4 packet + * @c: Execution context + * @src: IPv4 source address + * @sport: UDP source port + * @dst: IPv4 destination address + * @dport: UDP destination port + * @in: UDP payload contents (not including UDP header) + * @len: UDP payload length (not including UDP header) + */ +/* cppcheck-suppress unusedFunction */ +void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport, + in_addr_t dst, in_port_t dport, + const void *in, size_t len) +{ + size_t udplen = len + sizeof(struct udphdr); + char buf[USHRT_MAX]; + void *ip4h = tap_l2_hdr(c, buf, ETH_P_IP); + void *uhp = tap_ip4_hdr(ip4h, src, dst, udplen, IPPROTO_UDP);Two observations: - this saves one line and one cast, but it's really a bit unnatural that tap_ip4_hdr() doesn't point to the header it just made, or to nothing. I would rather have to +1 the return value or the original pointer instead or having this trick+ struct udphdr *uh = (struct udphdr *)uhp; + char *data = (char *)(uh + 1);- it's longer, but in my opinion clearer, if we split a bit more clearly the components of the packet, that is, something like (untested):
On Tue, Oct 18, 2022 at 02:27:04PM +0200, Stefano Brivio wrote:On Tue, 18 Oct 2022 23:07:58 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:I've gone with a variant of the 'push' naming, I think that makes it a bit clearer. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibsonOn Tue, Oct 18, 2022 at 05:06:34AM +0200, Stefano Brivio wrote:Hmm, yes, I think the variable length case is a very valid point, and also in terms of abstraction I see the advantage. There are just two things I can think of: - passing the end pointer as argument (not as practical as your solution, though) - naming it tap_ip4_push_hdr(), tap_ip4_hdr_after(), tap_ip4_hdr_goto_next(), tap_ip4_leave_header_behind()... I can't think of anything better at this point. I'll keep thinking, but at the moment I'd be fine even with the current name.On Mon, 17 Oct 2022 19:58:06 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Oops, yes, forgot to add a function comment. Done.tap_ip4_send() has special case logic to compute the checksums for UDP and ICMP packets, which is a mild layering violation. By using a suitable helper we can split it into tap_udp4_send() and tap_icmp4_send() functions without greatly increasing the code size, this removing that layering violation. We make some small changes to the interface while there. In both cases we make the destination IPv4 address a parameter, which will be useful later. For the UDP variant we make it take just the UDP payload, and it will generate the UDP header. For the ICMP variant we pass in the ICMP header as before. The inconsistency is because that's what seems to be the more natural way to invoke the function in the callers in each case. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- icmp.c | 3 ++- tap.c | 75 +++++++++++++++++++++++++++++++++++++++++----------------- tap.h | 7 ++++-- 3 files changed, 60 insertions(+), 25 deletions(-) diff --git a/icmp.c b/icmp.c index 6493ea9..233acf9 100644 --- a/icmp.c +++ b/icmp.c @@ -124,7 +124,8 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V4][id].seq = seq; } - tap_ip4_send(c, sr4->sin_addr.s_addr, IPPROTO_ICMP, buf, n); + tap_icmp4_send(c, sr4->sin_addr.s_addr, tap_ip4_daddr(c), + buf, n); } } diff --git a/tap.c b/tap.c index 274f4ba..5792880 100644 --- a/tap.c +++ b/tap.c @@ -127,20 +127,10 @@ static void *tap_l2_hdr(const struct ctx *c, void *buf, uint16_t proto) return eh + 1; } -/** - * tap_ip4_send() - Send IPv4 packet, with L2 headers, calculating L3/L4 checksums - * @c: Execution context - * @src: IPv4 source address - * @proto: L4 protocol number - * @in: Payload - * @len: L4 payload length - */ -void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, - const char *in, size_t len)I understand why you return ip(4)h + 1 here because I've just reviewed 9/14, I wouldn't know otherwise: /** * tap_ip4_hdr() - Build IPv4 header for inbound packet, with checksum * @c: Execution context * @src: IPv4 source address, network order * @dst: IPv4 destination address, network order * @len: L4 payload length * @proto: L4 protocol number * * Return: pointer to write payload to */I don't really want to change this. Yes, it's a bit counterintuitive at first blush, but there's a reason for this approach. This style of a function which generates a header then points *after* it works even if the header it generates is of variable length. Advancing to the payload in the caller doesn't (at least not without breaking the abstraction I'm trying to generate with these helpers). That's not just theoretical, because at some point I'd like to extend the l2_hdr function to also allocate space for the qemu socket length header. I'm certainly open to name changes to make this behaviour more obvious, but I think returning the payload pointer not the header pointer makes for a better abstraction here.+static void *tap_ip4_hdr(char *buf, in_addr_t src, in_addr_t dst, + size_t len, uint8_t proto) { - char buf[USHRT_MAX]; - struct iphdr *ip4h = (struct iphdr *)tap_l2_hdr(c, buf, ETH_P_IP); - char *data = (char *)(ip4h + 1); + struct iphdr *ip4h = (struct iphdr *)buf; ip4h->version = 4; ip4h->ihl = sizeof(struct iphdr) / 4; @@ -151,20 +141,61 @@ void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, ip4h->ttl = 255; ip4h->protocol = proto; ip4h->saddr = src; - ip4h->daddr = tap_ip4_daddr(c); + ip4h->daddr = dst; csum_ip4_header(ip4h); + return ip4h + 1; +} + +/** + * tap_udp4_send() - Send UDP over IPv4 packet + * @c: Execution context + * @src: IPv4 source address + * @sport: UDP source port + * @dst: IPv4 destination address + * @dport: UDP destination port + * @in: UDP payload contents (not including UDP header) + * @len: UDP payload length (not including UDP header) + */ +/* cppcheck-suppress unusedFunction */ +void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport, + in_addr_t dst, in_port_t dport, + const void *in, size_t len) +{ + size_t udplen = len + sizeof(struct udphdr); + char buf[USHRT_MAX]; + void *ip4h = tap_l2_hdr(c, buf, ETH_P_IP); + void *uhp = tap_ip4_hdr(ip4h, src, dst, udplen, IPPROTO_UDP);Two observations: - this saves one line and one cast, but it's really a bit unnatural that tap_ip4_hdr() doesn't point to the header it just made, or to nothing. I would rather have to +1 the return value or the original pointer instead or having this trick+ struct udphdr *uh = (struct udphdr *)uhp; + char *data = (char *)(uh + 1);- it's longer, but in my opinion clearer, if we split a bit more clearly the components of the packet, that is, something like (untested):
The IPv4 specific dhcp() manually constructs L2 and IP headers to send its DHCP reply packet, unlike its IPv6 equivalent in dhcpv6.c which uses the tap_udp6_send() helper. Now that we've broaded the parameters to tap_udp4_send() we can use it in dhcp() to avoid some duplicated logic. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- dhcp.c | 18 ++---------------- tap.c | 1 - 2 files changed, 2 insertions(+), 17 deletions(-) diff --git a/dhcp.c b/dhcp.c index 2b3af82..d22698a 100644 --- a/dhcp.c +++ b/dhcp.c @@ -363,22 +363,8 @@ int dhcp(const struct ctx *c, const struct pool *p) if (!c->no_dhcp_dns_search) opt_set_dns_search(c, sizeof(m->o)); - uh->len = htons(len = offsetof(struct msg, o) + fill(m) + sizeof(*uh)); - uh->source = htons(67); - uh->dest = htons(68); - csum_udp4(uh, c->ip4.gw, c->ip4.addr, uh + 1, len - sizeof(*uh)); - - iph->tot_len = htons(len += sizeof(*iph)); - iph->daddr = c->ip4.addr; - iph->saddr = c->ip4.gw; - csum_ip4_header(iph); - - len += sizeof(*eh); - memcpy(eh->h_dest, eh->h_source, ETH_ALEN); - memcpy(eh->h_source, c->mac, ETH_ALEN); - - if (tap_send(c, eh, len) < 0) - perror("DHCP: send"); + len = offsetof(struct msg, o) + fill(m); + tap_udp4_send(c, c->ip4.gw, 67, c->ip4.addr, 68, m, len); return 1; } diff --git a/tap.c b/tap.c index 5792880..75f1e38 100644 --- a/tap.c +++ b/tap.c @@ -156,7 +156,6 @@ static void *tap_ip4_hdr(char *buf, in_addr_t src, in_addr_t dst, * @in: UDP payload contents (not including UDP header) * @len: UDP payload length (not including UDP header) */ -/* cppcheck-suppress unusedFunction */ void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport, in_addr_t dst, in_port_t dport, const void *in, size_t len) -- 2.37.3