[PATCH v2 00/11] Improve robustness of calculations related to frame size limits

David Gibson

17 Mar 2025

10:24 a.m.

There are a number of places where we make calculations and checks around
how large frames can be and where they sit in memory.  Several of these are
roughly correct, but can be wrong in certain edge cases.  Improve robustness
by clarifying what we're doing and being more careful about the edge cases.

v2:
 * Added additional patches 5..11
 * Patches 1..4 rebased but unchanged

David Gibson (11):
  vu_common: Tighten vu_packet_check_range()
  packet: More cautious checks to avoid pointer arithmetic UB
  tap: Make size of pool_tap[46] purely a tuning parameter
  tap: Clarify calculation of TAP_MSGS
  packet: Correct type of PACKET_MAX_LEN
  packet: Avoid integer overflows in packet_get_do()
  packet: Move checks against PACKET_MAX_LEN to packet_check_range()
  packet: Rework packet_get() versus packet_get_try()
  util: Add abort_with_msg() and ASSERT_WITH_MSG() helpers
  packet: ASSERT on signs of pool corruption
  packet: Upgrade severity of most packet errors

 packet.c    | 110 ++++++++++++++++++++++++++++++++++------------------
 packet.h    |  13 +++++--
 passt.h     |   2 -
 tap.c       |  43 ++++++++++++++++----
 tap.h       |   3 +-
 util.c      |  19 +++++++++
 util.h      |  25 +++++-------
 vu_common.c |  15 ++++---
 8 files changed, 158 insertions(+), 72 deletions(-)

-- 
2.48.1


David Gibson

17 Mar

10:24 a.m.

New subject: [PATCH v2 01/11] vu_common: Tighten vu_packet_check_range()

This function verifies that the given packet is within the mmap()ed memory
region of the vhost-user device.  We can do better, however.  The packet
should be not only within the mmap()ed range, but specifically in the
subsection of that range set aside for shared buffers, which starts at
dev_region->mmap_offset within there.

Signed-off-by: David Gibson
---
 vu_common.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/vu_common.c b/vu_common.c
index 686a09b2..9eea4f2f 100644
--- a/vu_common.c
+++ b/vu_common.c
@@ -37,10 +37,10 @@ int vu_packet_check_range(void *buf, const char *ptr, size_t len)
 
 	for (dev_region = buf; dev_region->mmap_addr; dev_region++) {
 		/* NOLINTNEXTLINE(performance-no-int-to-ptr) */
-		char *m = (char *)(uintptr_t)dev_region->mmap_addr;
+		char *m = (char *)(uintptr_t)dev_region->mmap_addr +
+			dev_region->mmap_offset;
 
-		if (m <= ptr &&
-		    ptr + len <= m + dev_region->mmap_offset + dev_region->size)
+		if (m <= ptr && ptr + len <= m + dev_region->size)
 			return 0;
 	}
 
-- 
2.48.1

David Gibson

10:24 a.m.

New subject: [PATCH v2 02/11] packet: More cautious checks to avoid pointer arithmetic UB

packet_check_range() and vu_packet_check_range() verify that the packet or
section of packet we're interested in lies in the packet buffer pool we
expect it to.  However, in doing so they don't avoid the possibility of an
integer overflow while performing pointer arithmetic, which is UB.  In fact,
AFAICT it's UB even to use arbitrary pointer arithmetic to construct a
pointer outside of a known valid buffer.

To do this safely, we can't calculate the end of a memory region with
pointer addition when the length is untrusted.  Instead we must work out
the offset of one memory region within another using pointer subtraction,
then do integer checks against the length of the outer region.  We then
need to be careful about the order of checks so that those integer checks
can't themselves overflow.

Signed-off-by: David Gibson
---
 packet.c    | 12 +++++++++---
 vu_common.c | 10 +++++++---
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/packet.c b/packet.c
index bcac0375..d1a51a5b 100644
--- a/packet.c
+++ b/packet.c
@@ -52,9 +52,15 @@ static int packet_check_range(const struct pool *p, const char *ptr, size_t len,
 		return -1;
 	}
 
-	if (ptr + len > p->buf + p->buf_size) {
-		trace("packet range end %p after buffer end %p, %s:%i",
-		      (void *)(ptr + len), (void *)(p->buf + p->buf_size),
+	if (len > p->buf_size) {
+		trace("packet range length %zu larger than buffer %zu, %s:%i",
+		      len, p->buf_size, func, line);
+		return -1;
+	}
+
+	if ((size_t)(ptr - p->buf) > p->buf_size - len) {
+		trace("packet range %p, len %zu after buffer end %p, %s:%i",
+		      (void *)ptr, len, (void *)(p->buf + p->buf_size),
 		      func, line);
 		return -1;
 	}
diff --git a/vu_common.c b/vu_common.c
index 9eea4f2f..cefe5e20 100644
--- a/vu_common.c
+++ b/vu_common.c
@@ -36,11 +36,15 @@ int vu_packet_check_range(void *buf, const char *ptr, size_t len)
 	struct vu_dev_region *dev_region;
 
 	for (dev_region = buf; dev_region->mmap_addr; dev_region++) {
-		/* NOLINTNEXTLINE(performance-no-int-to-ptr) */
-		char *m = (char *)(uintptr_t)dev_region->mmap_addr +
+		uintptr_t base_addr = dev_region->mmap_addr +
 			dev_region->mmap_offset;
+		/* NOLINTNEXTLINE(performance-no-int-to-ptr) */
+		const char *base = (const char *)base_addr;
+
+		ASSERT(base_addr >= dev_region->mmap_addr);
 
-		if (m <= ptr && ptr + len <= m + dev_region->size)
+		if (len <= dev_region->size && base <= ptr &&
+		    (size_t)(ptr - base) <= dev_region->size - len)
 			return 0;
 	}
 
-- 
2.48.1
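The ordering of checks described above can be sketched in isolation.  This
is a minimal illustration, not the passt code: range_in_buf() is a
hypothetical helper, and it assumes, as the patch does, that ptr can
meaningfully be compared with and subtracted from buf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not in passt): true if [ptr, ptr + len) lies inside
 * [buf, buf + buf_size).  It never forms ptr + len, so an untrusted len can't
 * push a pointer out of bounds or wrap it around.
 */
static bool range_in_buf(const char *buf, size_t buf_size,
			 const char *ptr, size_t len)
{
	assert(buf != NULL);

	if (ptr < buf)			/* start before the buffer */
		return false;

	if (len > buf_size)		/* can't fit, wherever it starts */
		return false;

	/* Safe: len <= buf_size, so the subtraction can't underflow, and
	 * ptr >= buf, so the offset is non-negative.
	 */
	return (size_t)(ptr - buf) <= buf_size - len;
}
```

With a 16 byte buffer, range_in_buf(buf, 16, buf + 4, 12) holds while
range_in_buf(buf, 16, buf + 4, 13) does not, and a length of SIZE_MAX is
rejected by the second check before any subtraction involving len happens.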

David Gibson

10:24 a.m.

New subject: [PATCH v2 03/11] tap: Make size of pool_tap[46] purely a tuning parameter

Currently we attempt to size pool_tap[46] so they have room for the maximum
possible number of packets that could fit in pkt_buf (TAP_MSGS).  However,
the calculation isn't quite correct: TAP_MSGS is based on ETH_ZLEN (60) as
the minimum possible L2 frame size.  But ETH_ZLEN is based on physical
constraints of Ethernet, which don't apply to our virtual devices.  It is
possible to generate a legitimate frame smaller than this, for example an
empty payload UDP/IPv4 frame on the 'pasta' backend is only 42 bytes long.

Furthermore, the same limit applies for vhost-user, which is not limited by
the size of pkt_buf like the other backends.  In that case we don't even
have full control of the maximum buffer size, so we can't really calculate
how many packets could fit in there.

If we do exceed TAP_MSGS we'll drop packets, not just use more batches,
which is moderately bad.  The fact that this needs to be sized just so for
correctness, not merely for tuning, is a fairly non-obvious coupling between
different parts of the code.

To make this more robust, alter the tap code so it doesn't rely on
everything fitting in a single batch of TAP_MSGS packets, instead breaking
into multiple batches as necessary.  This leaves TAP_MSGS as purely a
tuning parameter, which we can freely adjust based on performance measures.

Signed-off-by: David Gibson
---
 packet.c    | 13 ++++++++++++-
 packet.h    |  3 +++
 passt.h     |  2 --
 tap.c       | 19 ++++++++++++++++---
 tap.h       |  3 ++-
 vu_common.c |  5 +++--
 6 files changed, 36 insertions(+), 9 deletions(-)

diff --git a/packet.c b/packet.c
index d1a51a5b..08076d57 100644
--- a/packet.c
+++ b/packet.c
@@ -67,6 +67,17 @@ static int packet_check_range(const struct pool *p, const char *ptr, size_t len,
 	return 0;
 }
 
+/**
+ * pool_full() - Is a packet pool full?
+ * @p:		Pointer to packet pool
+ *
+ * Return: true if the pool is full, false if more packets can be added
+ */
+bool pool_full(const struct pool *p)
+{
+	return p->count >= p->size;
+}
+
 /**
  * packet_add_do() - Add data as packet descriptor to given pool
  * @p:		Existing pool
@@ -80,7 +91,7 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
 {
 	size_t idx = p->count;
 
-	if (idx >= p->size) {
+	if (pool_full(p)) {
 		trace("add packet index %zu to pool with size %zu, %s:%i",
 		      idx, p->size, func, line);
 		return;
diff --git a/packet.h b/packet.h
index d099f026..dd18461b 100644
--- a/packet.h
+++ b/packet.h
@@ -6,6 +6,8 @@
 #ifndef PACKET_H
 #define PACKET_H
 
+#include <stdbool.h>
+
 /* Maximum size of a single packet stored in pool, including headers */
 #define PACKET_MAX_LEN	UINT16_MAX
 
@@ -33,6 +35,7 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
 void *packet_get_do(const struct pool *p, const size_t idx,
 		    size_t offset, size_t len, size_t *left,
 		    const char *func, int line);
+bool pool_full(const struct pool *p);
 void pool_flush(struct pool *p);
 
 #define packet_add(p, len, start)					\
diff --git a/passt.h b/passt.h
index 8f450912..8693794b 100644
--- a/passt.h
+++ b/passt.h
@@ -71,8 +71,6 @@ static_assert(sizeof(union epoll_ref) <= sizeof(union epoll_data),
 
 /* Large enough for ~128 maximum size frames */
 #define PKT_BUF_BYTES		(8UL << 20)
-#define TAP_MSGS							\
-	DIV_ROUND_UP(PKT_BUF_BYTES, ETH_ZLEN - 2 * ETH_ALEN + sizeof(uint32_t))
 
 extern char pkt_buf		[PKT_BUF_BYTES];
 
diff --git a/tap.c b/tap.c
index 182a1151..34e6774f 100644
--- a/tap.c
+++ b/tap.c
@@ -75,6 +75,9 @@ CHECK_FRAME_LEN(L2_MAX_LEN_PASTA);
 CHECK_FRAME_LEN(L2_MAX_LEN_PASST);
 CHECK_FRAME_LEN(L2_MAX_LEN_VU);
 
+#define TAP_MSGS							\
+	DIV_ROUND_UP(sizeof(pkt_buf), ETH_ZLEN - 2 * ETH_ALEN + sizeof(uint32_t))
+
 /* IPv4 (plus ARP) and IPv6 message batches from tap/guest to IP handlers */
 static PACKET_POOL_NOINIT(pool_tap4, TAP_MSGS, pkt_buf);
 static PACKET_POOL_NOINIT(pool_tap6, TAP_MSGS, pkt_buf);
@@ -1042,8 +1045,10 @@ void tap_handler(struct ctx *c, const struct timespec *now)
  * @c:		Execution context
  * @l2len:	Total L2 packet length
  * @p:		Packet buffer
+ * @now:	Current timestamp
  */
-void tap_add_packet(struct ctx *c, ssize_t l2len, char *p)
+void tap_add_packet(struct ctx *c, ssize_t l2len, char *p,
+		    const struct timespec *now)
 {
 	const struct ethhdr *eh;
 
@@ -1059,9 +1064,17 @@ void tap_add_packet(struct ctx *c, ssize_t l2len, char *p)
 	switch (ntohs(eh->h_proto)) {
 	case ETH_P_ARP:
 	case ETH_P_IP:
+		if (pool_full(pool_tap4)) {
+			tap4_handler(c, pool_tap4, now);
+			pool_flush(pool_tap4);
+		}
 		packet_add(pool_tap4, l2len, p);
 		break;
 	case ETH_P_IPV6:
+		if (pool_full(pool_tap6)) {
+			tap6_handler(c, pool_tap6, now);
+			pool_flush(pool_tap6);
+		}
 		packet_add(pool_tap6, l2len, p);
 		break;
 	default:
@@ -1142,7 +1155,7 @@ static void tap_passt_input(struct ctx *c, const struct timespec *now)
 		p += sizeof(uint32_t);
 		n -= sizeof(uint32_t);
 
-		tap_add_packet(c, l2len, p);
+		tap_add_packet(c, l2len, p, now);
 
 		p += l2len;
 		n -= l2len;
@@ -1207,7 +1220,7 @@ static void tap_pasta_input(struct ctx *c, const struct timespec *now)
 		    len > (ssize_t)L2_MAX_LEN_PASTA)
 			continue;
 
-		tap_add_packet(c, len, pkt_buf + n);
+		tap_add_packet(c, len, pkt_buf + n, now);
 	}
 
 	tap_handler(c, now);
diff --git a/tap.h b/tap.h
index dd39fd89..6fe3d15d 100644
--- a/tap.h
+++ b/tap.h
@@ -119,6 +119,7 @@ void tap_sock_update_pool(void *base, size_t size);
 void tap_backend_init(struct ctx *c);
 void tap_flush_pools(void);
 void tap_handler(struct ctx *c, const struct timespec *now);
-void tap_add_packet(struct ctx *c, ssize_t l2len, char *p);
+void tap_add_packet(struct ctx *c, ssize_t l2len, char *p,
+		    const struct timespec *now);
 
 #endif /* TAP_H */
diff --git a/vu_common.c b/vu_common.c
index cefe5e20..5e6fd4a8 100644
--- a/vu_common.c
+++ b/vu_common.c
@@ -195,7 +195,7 @@ static void vu_handle_tx(struct vu_dev *vdev, int index,
 			tap_add_packet(vdev->context,
 				       elem[count].out_sg[0].iov_len - hdrlen,
 				       (char *)elem[count].out_sg[0].iov_base +
-				       hdrlen);
+				       hdrlen, now);
 		} else {
 			/* vnet header can be in a separate iovec */
 			if (elem[count].out_num != 2) {
@@ -207,7 +207,8 @@ static void vu_handle_tx(struct vu_dev *vdev, int index,
 			} else {
 				tap_add_packet(vdev->context,
 					       elem[count].out_sg[1].iov_len,
-					       (char *)elem[count].out_sg[1].iov_base);
+					       (char *)elem[count].out_sg[1].iov_base,
+					       now);
 			}
 		}
 
-- 
2.48.1
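The "flush when full, then add" pattern this patch introduces in
tap_add_packet() can be shown with a toy pool.  Everything below is
hypothetical (struct sketch_batch and the sketch_*() functions are not
passt code); it only illustrates why the batch size stops mattering for
correctness once a full batch is processed instead of dropping input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BATCH_SIZE	4	/* tuning parameter only */

/* Hypothetical stand-in for a packet pool plus its handler state */
struct sketch_batch {
	size_t count;			/* items currently queued */
	int item[SKETCH_BATCH_SIZE];
	size_t flushes;			/* batches processed so far */
	long total;			/* sum of all processed items */
};

static bool sketch_batch_full(const struct sketch_batch *b)
{
	return b->count >= SKETCH_BATCH_SIZE;
}

/* Process and empty the current batch (handler plus flush in one step) */
static void sketch_batch_process(struct sketch_batch *b)
{
	size_t i;

	for (i = 0; i < b->count; i++)
		b->total += b->item[i];

	b->flushes++;
	b->count = 0;
}

/* Add one item, processing the pending batch first if there's no room */
static void sketch_batch_add(struct sketch_batch *b, int v)
{
	if (sketch_batch_full(b))
		sketch_batch_process(b);

	assert(b->count < SKETCH_BATCH_SIZE);
	b->item[b->count++] = v;
}
```

Feeding ten items through a batch of four and then processing the remainder
handles every item, in three batches; making SKETCH_BATCH_SIZE smaller or
larger only changes the number of batches.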

David Gibson

10:24 a.m.

New subject: [PATCH v2 04/11] tap: Clarify calculation of TAP_MSGS

The rationale behind the calculation of TAP_MSGS isn't necessarily obvious.
It's supposed to be the maximum number of packets that can fit in pkt_buf.
However, the calculation is wrong in several ways:
 * It's based on ETH_ZLEN which isn't meaningful for virtual devices
 * It always includes the qemu socket header which isn't used for pasta
 * The size of pkt_buf isn't relevant for vhost-user

We've already made sure this is just a tuning parameter, not a hard limit.
Clarify what we're calculating here and why.

Signed-off-by: David Gibson
---
 tap.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/tap.c b/tap.c
index 34e6774f..3a6fcbe8 100644
--- a/tap.c
+++ b/tap.c
@@ -75,12 +75,28 @@ CHECK_FRAME_LEN(L2_MAX_LEN_PASTA);
 CHECK_FRAME_LEN(L2_MAX_LEN_PASST);
 CHECK_FRAME_LEN(L2_MAX_LEN_VU);
 
-#define TAP_MSGS							\
-	DIV_ROUND_UP(sizeof(pkt_buf), ETH_ZLEN - 2 * ETH_ALEN + sizeof(uint32_t))
+/* We try size the packet pools so that we can use a single batch for the entire
+ * packet buffer.  This might be exceeded for vhost-user, though, which uses its
+ * own buffers rather than pkt_buf.
+ *
+ * This is just a tuning parameter, the code will work with slightly more
+ * overhead if it's incorrect.  So, we estimate based on the minimum practical
+ * frame size - an empty UDP datagram - rather than the minimum theoretical
+ * frame size.
+ *
+ * FIXME: Profile to work out how big this actually needs to be to amortise
+ *        per-batch syscall overheads
+ */
+#define TAP_MSGS_IP4							\
+	DIV_ROUND_UP(sizeof(pkt_buf),					\
+		     ETH_HLEN + sizeof(struct iphdr) + sizeof(struct udphdr))
+#define TAP_MSGS_IP6							\
+	DIV_ROUND_UP(sizeof(pkt_buf),					\
+		     ETH_HLEN + sizeof(struct ipv6hdr) + sizeof(struct udphdr))
 
 /* IPv4 (plus ARP) and IPv6 message batches from tap/guest to IP handlers */
-static PACKET_POOL_NOINIT(pool_tap4, TAP_MSGS, pkt_buf);
-static PACKET_POOL_NOINIT(pool_tap6, TAP_MSGS, pkt_buf);
+static PACKET_POOL_NOINIT(pool_tap4, TAP_MSGS_IP4, pkt_buf);
+static PACKET_POOL_NOINIT(pool_tap6, TAP_MSGS_IP6, pkt_buf);
 
 #define TAP_SEQS		128 /* Different L4 tuples in one batch */
 #define FRAGMENT_MSG_RATE	10  /* # seconds between fragment warnings */
@@ -1418,8 +1434,8 @@ void tap_sock_update_pool(void *base, size_t size)
 {
 	int i;
 
-	pool_tap4_storage = PACKET_INIT(pool_tap4, TAP_MSGS, base, size);
-	pool_tap6_storage = PACKET_INIT(pool_tap6, TAP_MSGS, base, size);
+	pool_tap4_storage = PACKET_INIT(pool_tap4, TAP_MSGS_IP4, base, size);
+	pool_tap6_storage = PACKET_INIT(pool_tap6, TAP_MSGS_IP6, base, size);
 
 	for (i = 0; i < TAP_SEQS; i++) {
 		tap4_l4[i].p = PACKET_INIT(pool_l4, UIO_MAXIOV, base, size);
-- 
2.48.1
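For a sense of scale, the two pool sizes can be worked out by hand.  The
sketch below is not passt code: it hard-codes the usual header sizes
(14 byte Ethernet, 20 byte IPv4 without options, 40 byte IPv6, 8 byte UDP)
instead of taking sizeof() of the kernel structs, uses the 8 MiB
PKT_BUF_BYTES from passt.h, and assumes DIV_ROUND_UP() is the usual
round-up division.  All SKETCH_ names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to behave like DIV_ROUND_UP() for positive operands */
#define SKETCH_DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

#define SKETCH_PKT_BUF_BYTES	(8UL << 20)	/* 8 MiB, as in passt.h */
#define SKETCH_ETH_HLEN		14UL		/* Ethernet header */
#define SKETCH_IP4_HLEN		20UL		/* IPv4 header, no options */
#define SKETCH_IP6_HLEN		40UL		/* IPv6 fixed header */
#define SKETCH_UDP_HLEN		8UL		/* UDP header */

/* Smallest practical frames: an empty UDP datagram over IPv4 or IPv6 */
#define SKETCH_MIN_FRAME_IP4						\
	(SKETCH_ETH_HLEN + SKETCH_IP4_HLEN + SKETCH_UDP_HLEN)
#define SKETCH_MIN_FRAME_IP6						\
	(SKETCH_ETH_HLEN + SKETCH_IP6_HLEN + SKETCH_UDP_HLEN)

#define SKETCH_TAP_MSGS_IP4						\
	SKETCH_DIV_ROUND_UP(SKETCH_PKT_BUF_BYTES, SKETCH_MIN_FRAME_IP4)
#define SKETCH_TAP_MSGS_IP6						\
	SKETCH_DIV_ROUND_UP(SKETCH_PKT_BUF_BYTES, SKETCH_MIN_FRAME_IP6)

/* 42 bytes matches the empty UDP/IPv4 frame mentioned in patch 03/11 */
static_assert(SKETCH_MIN_FRAME_IP4 == 42, "empty UDP/IPv4 frame");
static_assert(SKETCH_MIN_FRAME_IP6 == 62, "empty UDP/IPv6 frame");

/* 8388608 / 42 and 8388608 / 62, rounded up */
static_assert(SKETCH_TAP_MSGS_IP4 == 199729, "IPv4 pool slots");
static_assert(SKETCH_TAP_MSGS_IP6 == 135301, "IPv6 pool slots");
```

Under these assumptions the IPv4 pool gets 199729 descriptors and the IPv6
pool 135301.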

David Gibson

10:24 a.m.

New subject: [PATCH v2 05/11] packet: Correct type of PACKET_MAX_LEN

PACKET_MAX_LEN is usually involved in calculations on size_t values - the
type of the iov_len field in struct iovec.  However, being defined bare as
UINT16_MAX, the compiler is likely to assign it a shorter type.  This can
lead to unexpected promotions (or lack thereof).  Add a cast to force the
type to be what we expect.

Fixes: c43972ad6 ("packet: Give explicit name to maximum packet size")
Signed-off-by: David Gibson
---
 packet.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/packet.h b/packet.h
index dd18461b..9061dad7 100644
--- a/packet.h
+++ b/packet.h
@@ -9,7 +9,7 @@
 #include <stdbool.h>
 
 /* Maximum size of a single packet stored in pool, including headers */
-#define PACKET_MAX_LEN	UINT16_MAX
+#define PACKET_MAX_LEN	((size_t)UINT16_MAX)
 
 /**
  * struct pool - Generic pool of packets stored in a buffer
-- 
2.48.1
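One place where the type matters is variadic calls: a "%zu" conversion needs
a size_t argument, which is how patch 07/11 later prints PACKET_MAX_LEN.
The sketch below is only an illustration with made-up names (SKETCH_MAX_BARE,
SKETCH_MAX_CAST, sketch_format_limit()).  It relies on the C rule that
UINT16_MAX has the type uint16_t would have after integer promotion, which
is int on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_MAX_BARE	UINT16_MAX		/* promoted type, usually int */
#define SKETCH_MAX_CAST	((size_t)UINT16_MAX)	/* always size_t */

/* The bare macro is only as wide as int, the cast one is as wide as size_t */
static_assert(sizeof(SKETCH_MAX_BARE) == sizeof(int), "promoted type");
static_assert(sizeof(SKETCH_MAX_CAST) == sizeof(size_t), "explicit size_t");

/* Format a length against the limit.  Passing SKETCH_MAX_BARE for the second
 * "%zu" would be undefined behaviour wherever int and size_t differ.
 */
static int sketch_format_limit(char *out, size_t out_size, size_t len)
{
	return snprintf(out, out_size, "length %zu (max %zu)",
			len, SKETCH_MAX_CAST);
}
```

For example, sketch_format_limit(buf, sizeof(buf), 70000) writes
"length 70000 (max 65535)" into buf.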

David Gibson

10:24 a.m.

New subject: [PATCH v2 06/11] packet: Avoid integer overflows in packet_get_do()

In packet_get_do() both offset and len are essentially untrusted.  We do
some validation of len (check it's < PACKET_MAX_LEN), but that's not enough
to ensure that (len + offset) doesn't overflow.  Rearrange our calculation
to make sure it's safe regardless of the given offset & len values.

Signed-off-by: David Gibson
---
 packet.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/packet.c b/packet.c
index 08076d57..fdc4be76 100644
--- a/packet.c
+++ b/packet.c
@@ -144,7 +144,8 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
 		return NULL;
 	}
 
-	if (len + offset > p->pkt[idx].iov_len) {
+	if (offset > p->pkt[idx].iov_len ||
+	    len > (p->pkt[idx].iov_len - offset)) {
 		if (func) {
 			trace("data length %zu, offset %zu from length %zu, "
 			      "%s:%i", len, offset, p->pkt[idx].iov_len,
-- 
2.48.1
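To see why the sum form is unsafe, compare the two checks on a huge offset.
The helpers below are hypothetical and only restate the old and new
conditions on plain size_t values, returning true when the range is accepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned arithmetic wraps silently: SIZE_MAX + 2 is 1 */
static_assert((size_t)-1 + (size_t)2 == 1, "size_t addition wraps");

/* Old form: accepts the range unless len + offset exceeds total, but the
 * sum itself can wrap around to a small number.
 */
static bool sketch_subrange_ok_naive(size_t total, size_t offset, size_t len)
{
	return !(len + offset > total);
}

/* New form: check offset first, then compare len with what's left.  The
 * subtraction can't underflow because offset <= total at that point.
 */
static bool sketch_subrange_ok(size_t total, size_t offset, size_t len)
{
	return offset <= total && len <= total - offset;
}
```

With total = 100, offset = SIZE_MAX and len = 2, the naive check accepts the
range because the sum wraps to 1, while the rearranged check rejects it.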

David Gibson

10:24 a.m.

New subject: [PATCH v2 07/11] packet: Move checks against PACKET_MAX_LEN to packet_check_range()

Both the callers of packet_check_range() separately verify that the given
length does not exceed PACKET_MAX_LEN.  Fold that check into
packet_check_range() instead.

Signed-off-by: David Gibson
---
 packet.c | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/packet.c b/packet.c
index fdc4be76..7cbe95da 100644
--- a/packet.c
+++ b/packet.c
@@ -35,6 +35,12 @@
 static int packet_check_range(const struct pool *p, const char *ptr, size_t len,
 			      const char *func, int line)
 {
+	if (len > PACKET_MAX_LEN) {
+		trace("packet range length %zu (max %zu), %s:%i",
+		      len, PACKET_MAX_LEN, func, line);
+		return -1;
+	}
+
 	if (p->buf_size == 0) {
 		int ret;
 
@@ -100,11 +106,6 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
 	if (packet_check_range(p, start, len, func, line))
 		return;
 
-	if (len > PACKET_MAX_LEN) {
-		trace("add packet length %zu, %s:%i", len, func, line);
-		return;
-	}
-
 	p->pkt[idx].iov_base = (void *)start;
 	p->pkt[idx].iov_len = len;
 
@@ -136,14 +137,6 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
 		return NULL;
 	}
 
-	if (len > PACKET_MAX_LEN) {
-		if (func) {
-			trace("packet data length %zu, %s:%i",
-			      len, func, line);
-		}
-		return NULL;
-	}
-
 	if (offset > p->pkt[idx].iov_len ||
 	    len > (p->pkt[idx].iov_len - offset)) {
 		if (func) {
-- 
2.48.1

David Gibson

10:24 a.m.

New subject: [PATCH v2 08/11] packet: Rework packet_get() versus packet_get_try()

Most failures of packet_get() indicate a serious problem, and log messages
accordingly.  However, a few callers expect failures here, because they're
probing for a certain range which might or might not be in a packet.  They
use packet_get_try() which passes a NULL func to packet_get_do() to
suppress the logging which is unwanted in this case.

However, this doesn't just suppress the log when packet_get_do() finds the
requested region isn't in the packet.  It suppresses logging for all other
errors too, which do indicate serious problems, even for the callers of
packet_get_try().  Worse, it will pass the NULL func on to
packet_check_range() which doesn't expect it, meaning we'll get unhelpful
messages from there if there is a failure.

Fix this by making packet_get_try_do() the primary function which doesn't
log for the case of a range outside the packet.  packet_get_do() becomes a
trivial wrapper around that which logs a message if packet_get_try_do()
returns NULL.

Signed-off-by: David Gibson
---
 packet.c | 51 +++++++++++++++++++++++++++++++++++----------------
 packet.h |  8 +++++---
 2 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/packet.c b/packet.c
index 7cbe95da..b3e8c79e 100644
--- a/packet.c
+++ b/packet.c
@@ -89,7 +89,7 @@ bool pool_full(const struct pool *p)
  * @p:		Existing pool
  * @len:	Length of new descriptor
  * @start:	Start of data
- * @func:	For tracing: name of calling function, NULL means no trace()
+ * @func:	For tracing: name of calling function
  * @line:	For tracing: caller line of function call
  */
 void packet_add_do(struct pool *p, size_t len, const char *start,
@@ -113,39 +113,31 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
 }
 
 /**
- * packet_get_do() - Get data range from packet descriptor from given pool
+ * packet_get_try_do() - Get data range from packet descriptor from given pool
  * @p:		Packet pool
  * @idx:	Index of packet descriptor in pool
  * @offset:	Offset of data range in packet descriptor
  * @len:	Length of desired data range
  * @left:	Length of available data after range, set on return, can be NULL
- * @func:	For tracing: name of calling function, NULL means no trace()
+ * @func:	For tracing: name of calling function
  * @line:	For tracing: caller line of function call
  *
  * Return: pointer to start of data range, NULL on invalid range or descriptor
  */
-void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
-		    size_t len, size_t *left, const char *func, int line)
+void *packet_get_try_do(const struct pool *p, size_t idx, size_t offset,
+			size_t len, size_t *left, const char *func, int line)
 {
 	char *ptr;
 
 	if (idx >= p->size || idx >= p->count) {
-		if (func) {
-			trace("packet %zu from pool size: %zu, count: %zu, "
-			      "%s:%i", idx, p->size, p->count, func, line);
-		}
+		trace("packet %zu from pool size: %zu, count: %zu, %s:%i",
+		      idx, p->size, p->count, func, line);
 		return NULL;
 	}
 
 	if (offset > p->pkt[idx].iov_len ||
-	    len > (p->pkt[idx].iov_len - offset)) {
-		if (func) {
-			trace("data length %zu, offset %zu from length %zu, "
-			      "%s:%i", len, offset, p->pkt[idx].iov_len,
-			      func, line);
-		}
+	    len > (p->pkt[idx].iov_len - offset))
 		return NULL;
-	}
 
 	ptr = (char *)p->pkt[idx].iov_base + offset;
 
@@ -158,6 +150,33 @@ void *packet_get_do(const struct pool *p, size_t idx, size_t offset,
 	return ptr;
 }
 
+/**
+ * packet_get_do() - Get data range from packet descriptor from given pool
+ * @p:		Packet pool
+ * @idx:	Index of packet descriptor in pool
+ * @offset:	Offset of data range in packet descriptor
+ * @len:	Length of desired data range
+ * @left:	Length of available data after range, set on return, can be NULL
+ * @func:	For tracing: name of calling function
+ * @line:	For tracing: caller line of function call
+ *
+ * Return: as packet_get_try_do() but log a trace message when returning NULL
+ */
+void *packet_get_do(const struct pool *p, const size_t idx,
+		    size_t offset, size_t len, size_t *left,
+		    const char *func, int line)
+{
+	void *r = packet_get_try_do(p, idx, offset, len, left, func, line);
+
+	if (!r) {
+		trace("missing packet data length %zu, offset %zu from "
+		      "length %zu, %s:%i",
+		      len, offset, p->pkt[idx].iov_len, func, line);
+	}
+
+	return r;
+}
+
 /**
  * pool_flush() - Flush a packet pool
  * @p:		Pointer to packet pool
diff --git a/packet.h b/packet.h
index 9061dad7..c94780a5 100644
--- a/packet.h
+++ b/packet.h
@@ -32,6 +32,9 @@ struct pool {
 int vu_packet_check_range(void *buf, const char *ptr, size_t len);
 void packet_add_do(struct pool *p, size_t len, const char *start,
 		   const char *func, int line);
+void *packet_get_try_do(const struct pool *p, const size_t idx,
+			size_t offset, size_t len, size_t *left,
+			const char *func, int line);
 void *packet_get_do(const struct pool *p, const size_t idx,
 		    size_t offset, size_t len, size_t *left,
 		    const char *func, int line);
@@ -41,12 +44,11 @@ void pool_flush(struct pool *p);
 #define packet_add(p, len, start)					\
 	packet_add_do(p, len, start, __func__, __LINE__)
 
+#define packet_get_try(p, idx, offset, len, left)			\
+	packet_get_try_do(p, idx, offset, len, left, __func__, __LINE__)
 #define packet_get(p, idx, offset, len, left)				\
 	packet_get_do(p, idx, offset, len, left, __func__, __LINE__)
 
-#define packet_get_try(p, idx, offset, len, left)			\
-	packet_get_do(p, idx, offset, len, left, NULL, 0)
-
 #define PACKET_POOL_DECL(_name, _size, _buf)				\
 struct _name ## _t {							\
 	char *buf;							\
-- 
2.48.1
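The layering (quiet primary function, logging wrapper) can be reduced to a
few lines.  This is a simplified sketch, not the passt implementation:
struct sketch_pkt and the sketch_get*() functions are hypothetical, there is
no pool or index, and fprintf() to stderr stands in for trace().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical single packet: start of data and its length */
struct sketch_pkt {
	const char *base;
	size_t len;
};

/* Primary function: a range outside the packet is an expected outcome for
 * probing callers, so it is reported by returning NULL, with no log message.
 */
static const char *sketch_get_try(const struct sketch_pkt *p,
				  size_t offset, size_t len)
{
	assert(p && p->base);

	if (offset > p->len || len > p->len - offset)
		return NULL;

	return p->base + offset;
}

/* Wrapper for callers that treat a missing range as a problem: same result,
 * plus a message naming the calling function and line.
 */
static const char *sketch_get(const struct sketch_pkt *p,
			      size_t offset, size_t len,
			      const char *func, int line)
{
	const char *r = sketch_get_try(p, offset, len);

	if (!r) {
		fprintf(stderr, "missing packet data length %zu, offset %zu "
			"from length %zu, %s:%i\n",
			len, offset, p->len, func, line);
	}

	return r;
}
```

A caller that is merely probing uses sketch_get_try() and checks for NULL;
a caller that requires the data uses sketch_get() and gets the diagnostic
for free.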

David Gibson

10:24 a.m.

New subject: [PATCH v2 09/11] util: Add abort_with_msg() and ASSERT_WITH_MSG() helpers

We already have the ASSERT() macro which will abort() passt based on a
condition.  It always has a fixed error message based on its location and
the asserted expression.  We have some upcoming cases where we want to
customise the message when hitting an assert.

Add abort_with_msg() and ASSERT_WITH_MSG() helpers to allow this.

Signed-off-by: David Gibson
---
 util.c | 19 +++++++++++++++++++
 util.h | 25 ++++++++++---------------
 2 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/util.c b/util.c
index 656e86ad..b9a3d434 100644
--- a/util.c
+++ b/util.c
@@ -1017,3 +1017,22 @@ void encode_domain_name(char *buf, const char *domain_name)
 	}
 	p[i] = 0L;
 }
+
+/**
+ * abort_with_msg() - Print error message and abort
+ * @fmt:	Format string
+ * @...:	Format parameters
+ */
+void abort_with_msg(const char *fmt, ...)
+{
+	va_list ap;
+
+	va_start(ap, fmt);
+	vlogmsg(true, false, LOG_CRIT, fmt, ap);
+	va_end(ap);
+
+	/* This may actually cause a SIGSYS instead of SIGABRT, due to seccomp,
+	 * but that will still get the job done.
+	 */
+	abort();
+}
diff --git a/util.h b/util.h
index 4d512fab..b1e7e79a 100644
--- a/util.h
+++ b/util.h
@@ -61,27 +61,22 @@
 #define STRINGIFY(x)	#x
 #define STR(x)		STRINGIFY(x)
 
-#ifdef CPPCHECK_6936
+void abort_with_msg(const char *fmt, ...)
+	__attribute__((format(printf, 1, 2), noreturn));
+
 /* Some cppcheck versions get confused by aborts inside a loop, causing
  * it to give false positive uninitialised variable warnings later in
  * the function, because it doesn't realise the non-initialising path
  * already exited.  See https://trac.cppcheck.net/ticket/13227
+ *
+ * Therefore, avoid using the usual do while wrapper we use to force the macro
+ * to act like a single statement requiring a ';'.
  */
-#define ASSERT(expr)		\
-	((expr) ? (void)0 : abort())
-#else
+#define ASSERT_WITH_MSG(expr, ...)					\
+	((expr) ? (void)0 : abort_with_msg(__VA_ARGS__))
 #define ASSERT(expr)							\
-	do {								\
-		if (!(expr)) {						\
-			err("ASSERTION FAILED in %s (%s:%d): %s",	\
-			    __func__, __FILE__, __LINE__, STRINGIFY(expr)); \
-			/* This may actually SIGSYS, due to seccomp,	\
-			 * but that will still get the job done		\
-			 */						\
-			abort();					\
-		}							\
-	} while (0)
-#endif
+	ASSERT_WITH_MSG((expr), "ASSSERTION FAILED in %s (%s:%d): %s",	\
+			__func__, __FILE__, __LINE__, STRINGIFY(expr))
 
 #ifdef P_tmpdir
 #define TMPDIR		P_tmpdir
-- 
2.48.1
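Outside of passt the same shape can be tried stand-alone.  The sketch below
reuses the names from the patch but is only an approximation: vfprintf() to
stderr stands in for vlogmsg(), the function is static, and
sketch_checked_index() is a made-up caller.  The attribute syntax assumes
GCC or clang.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-alone approximation of the helper: print to stderr, then abort */
static void abort_with_msg(const char *fmt, ...)
	__attribute__((format(printf, 1, 2), noreturn));

static void abort_with_msg(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);

	abort();
}

/* Expression form rather than do { } while (0), as in the patch */
#define ASSERT_WITH_MSG(expr, ...)					\
	((expr) ? (void)0 : abort_with_msg(__VA_ARGS__))

/* Hypothetical caller: the message carries the values that went wrong */
static size_t sketch_checked_index(size_t idx, size_t count)
{
	ASSERT_WITH_MSG(idx < count,
			"index %zu out of range (count %zu)", idx, count);
	return idx;
}
```

Because the macro expands to a conditional expression, the format arguments
are only evaluated when the assertion fails.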

David Gibson

10:24 a.m.

New subject: [PATCH v2 10/11] packet: ASSERT on signs of pool corruption

If packet_check_range() fails in packet_get_try_do() we just return NULL.
But this check only takes place after we've already validated the given
range against the packet it's in.  That means that if packet_check_range()
fails, the packet pool is already in a corrupted state (we should have made
strictly stronger checks when the packet was added).  Simply returning NULL
and logging a trace() level message isn't really adequate for that
situation; ASSERT instead.

Similarly we check the given idx against both p->count and p->size.  The
latter should be redundant, because count should always be <= size.  If
that's not the case then, again, the pool is already in a corrupted state
and we may have overwritten unknown memory.  Assert for this case too.

Signed-off-by: David Gibson
---
 packet.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/packet.c b/packet.c
index b3e8c79e..be28f279 100644
--- a/packet.c
+++ b/packet.c
@@ -129,9 +129,13 @@ void *packet_get_try_do(const struct pool *p, size_t idx, size_t offset,
 {
 	char *ptr;
 
-	if (idx >= p->size || idx >= p->count) {
-		trace("packet %zu from pool size: %zu, count: %zu, %s:%i",
-		      idx, p->size, p->count, func, line);
+	ASSERT_WITH_MSG(p->count <= p->size,
+			"Corrupt pool count: %zu, size: %zu, %s:%i",
+			p->count, p->size, func, line);
+
+	if (idx >= p->count) {
+		trace("packet %zu from pool count: %zu, %s:%i",
+		      idx, p->count, func, line);
 		return NULL;
 	}
 
@@ -141,8 +145,8 @@ void *packet_get_try_do(const struct pool *p, size_t idx, size_t offset,
 
 	ptr = (char *)p->pkt[idx].iov_base + offset;
 
-	if (packet_check_range(p, ptr, len, func, line))
-		return NULL;
+	ASSERT_WITH_MSG(!packet_check_range(p, ptr, len, func, line),
+			"Corrupt packet pool, %s:%i", func, line);
 
 	if (left)
 		*left = p->pkt[idx].iov_len - offset - len;
-- 
2.48.1

David Gibson

10:24 a.m.

New subject: [PATCH v2 11/11] packet: Upgrade severity of most packet errors

All errors from packet_check_range(), packet_add() and packet_get() are
trace level.  However, these are for the most part actual error conditions.
They're states that should not happen, in many cases indicating a bug in
the caller or elsewhere.

We don't promote these to err() or ASSERT() level, for fear of a localised
bug on very specific input crashing the entire program, or flooding the
logs, but we can at least upgrade them to debug level.

Signed-off-by: David Gibson
---
 packet.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/packet.c b/packet.c
index be28f279..72c61580 100644
--- a/packet.c
+++ b/packet.c
@@ -36,7 +36,7 @@ static int packet_check_range(const struct pool *p, const char *ptr, size_t len,
 			      const char *func, int line)
 {
 	if (len > PACKET_MAX_LEN) {
-		trace("packet range length %zu (max %zu), %s:%i",
+		debug("packet range length %zu (max %zu), %s:%i",
 		      len, PACKET_MAX_LEN, func, line);
 		return -1;
 	}
@@ -47,25 +47,25 @@ static int packet_check_range(const struct pool *p, const char *ptr, size_t len,
 		ret = vu_packet_check_range((void *)p->buf, ptr, len);
 
 		if (ret == -1)
-			trace("cannot find region, %s:%i", func, line);
+			debug("cannot find region, %s:%i", func, line);
 
 		return ret;
 	}
 
 	if (ptr < p->buf) {
-		trace("packet range start %p before buffer start %p, %s:%i",
+		debug("packet range start %p before buffer start %p, %s:%i",
 		      (void *)ptr, (void *)p->buf, func, line);
 		return -1;
 	}
 
 	if (len > p->buf_size) {
-		trace("packet range length %zu larger than buffer %zu, %s:%i",
+		debug("packet range length %zu larger than buffer %zu, %s:%i",
 		      len, p->buf_size, func, line);
 		return -1;
 	}
 
 	if ((size_t)(ptr - p->buf) > p->buf_size - len) {
-		trace("packet range %p, len %zu after buffer end %p, %s:%i",
+		debug("packet range %p, len %zu after buffer end %p, %s:%i",
 		      (void *)ptr, len, (void *)(p->buf + p->buf_size),
 		      func, line);
 		return -1;
@@ -98,7 +98,7 @@ void packet_add_do(struct pool *p, size_t len, const char *start,
 	size_t idx = p->count;
 
 	if (pool_full(p)) {
-		trace("add packet index %zu to pool with size %zu, %s:%i",
+		debug("add packet index %zu to pool with size %zu, %s:%i",
 		      idx, p->size, func, line);
 		return;
 	}
@@ -134,7 +134,7 @@ void *packet_get_try_do(const struct pool *p, size_t idx, size_t offset,
 			p->count, p->size, func, line);
 
 	if (idx >= p->count) {
-		trace("packet %zu from pool count: %zu, %s:%i",
+		debug("packet %zu from pool count: %zu, %s:%i",
 		      idx, p->count, func, line);
 		return NULL;
 	}
-- 
2.48.1

Stefano Brivio

20 Mar

11:32 p.m.

On Mon, 17 Mar 2025 20:24:13 +1100 David Gibson wrote:

...

> There are a number of places where we make calculations and checks around
> how large frames can be and where they sit in memory.  Several of these are
> roughly correct, but can be wrong in certain edge cases.  Improve robustness
> by clarifying what we're doing and being more careful about the edge cases.
>
> v2:
>  * Added additional patches 5..11
>  * Patches 1..4 rebased but unchanged

Applied... it took me a while to convince myself that the refactored checks
in 2/11 and 10/11 (functionally different in one case) are in fact
equivalent (and functionally equivalent in the bigger picture for that
idx >= p->size now gone from 10/11), but yes, of course, they weren't robust
earlier.

-- 
Stefano
