On Thu, Dec 19, 2024 at 10:00:15AM +0100, Stefano Brivio wrote:On Fri, 13 Dec 2024 23:01:56 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Yes. I'm certainly open to arguments on what the number should be.Currently we attempt to size pool_tap[46] so they have room for the maximum possible number of packets that could fit in pkt_buf, TAP_MSGS. However, the calculation isn't quite correct: TAP_MSGS is based on ETH_ZLEN (60) as the minimum possible L2 frame size. But, we don't enforce that L2 frames are at least ETH_ZLEN when we receive them from the tap backend, and since we're dealing with virtual interfaces we don't have the physical Ethernet limitations requiring that length. Indeed it is possible to generate a legitimate frame smaller than that (e.g. a zero-payload UDP/IPv4 frame on the 'pasta' backend is only 42 bytes long). It's also unclear if this limit is sufficient for vhost-user which isn't limited by the size of pkt_buf as the other modes are. We could attempt to correct the calculation, but that would leave us with even larger arrays, which in practice rarely accumulate more than a handful of packets. So, instead, put an arbitrary cap on the number of packets we can put in a batch, and if we run out of space, process and flush the batch. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- packet.c | 13 ++++++++++++- packet.h | 3 +++ passt.h | 2 -- tap.c | 18 +++++++++++++++--- tap.h | 3 ++- vu_common.c | 3 ++- 6 files changed, 34 insertions(+), 8 deletions(-) diff --git a/packet.c b/packet.c index 5bfa7304..b68580cc 100644 --- a/packet.c +++ b/packet.c @@ -22,6 +22,17 @@ #include "util.h" #include "log.h" +/** + * pool_full() - Is a packet pool full? + * @p: Pointer to packet pool + * + * Return: true if the pool is full, false if more packets can be added + */ +bool pool_full(const struct pool *p) +{ + return p->count >= p->size; +} + /** * packet_add_do() - Add data as packet descriptor to given pool * @p: Existing pool @@ -35,7 +46,7 @@ void packet_add_do(struct pool *p, size_t len, const char *start, { size_t idx = p->count; - if (idx >= p->size) { + if (pool_full(p)) { trace("add packet index %zu to pool with size %zu, %s:%i", idx, p->size, func, line); return; diff --git a/packet.h b/packet.h index 98eb8812..3618f213 100644 --- a/packet.h +++ b/packet.h @@ -6,6 +6,8 @@ #ifndef PACKET_H #define PACKET_H +#include <stdbool.h> + /** * struct pool - Generic pool of packets stored in nmemory * @size: Number of usable descriptors for the pool @@ -23,6 +25,7 @@ void packet_add_do(struct pool *p, size_t len, const char *start, void *packet_get_do(const struct pool *p, const size_t idx, size_t offset, size_t len, size_t *left, const char *func, int line); +bool pool_full(const struct pool *p); void pool_flush(struct pool *p); #define packet_add(p, len, start) \ diff --git a/passt.h b/passt.h index 0dd4efa0..81b2787f 100644 --- a/passt.h +++ b/passt.h @@ -70,8 +70,6 @@ static_assert(sizeof(union epoll_ref) <= sizeof(union epoll_data), #define TAP_BUF_BYTES \ ROUND_DOWN(((ETH_MAX_MTU + sizeof(uint32_t)) * 128), PAGE_SIZE) -#define TAP_MSGS \ - DIV_ROUND_UP(TAP_BUF_BYTES, ETH_ZLEN - 2 * ETH_ALEN + sizeof(uint32_t)) #define PKT_BUF_BYTES MAX(TAP_BUF_BYTES, 0) extern char pkt_buf [PKT_BUF_BYTES]; diff --git a/tap.c b/tap.c index 68231f09..42370a26 100644 --- a/tap.c +++ b/tap.c @@ -61,6 +61,8 @@ #include "vhost_user.h" #include "vu_common.h" +#define TAP_MSGS 256Sorry, I stopped at 2/3, had just a quick look at this one, and I missed this. Assuming 4 KiB pages, this changes from 161319 to 256. You mention thatin practice we never have more than a handful of messages, which is probably almost always the case, but I wonder if that's also the case with UDP "real-time" streams, where we could have bursts of a few hundred (thousand?) messages at a time.Maybe. If we are getting them in large bursts, then we're no longer really suceeding at the streams being "real-time", but sure, we should try to catch up as best we can.I wonder: how bad would it be to correct the calculation, instead? We wouldn't actually use more memory, right?I was pretty painful when I tried, and it would use more memory. The safe option would be to use ETH_HLEN as the minimum size (which is pretty much all we enforce in the tap layer), which would expand the iovec array here by 2-3x. It's not enormous, but it's not nothing. Or do you mean the unused pages of the array would never be instantiated? In which case, yeah, I guess not. Remember that with the changes in this patch if we exceed TAP_MSGS, nothing particularly bad happens: we don't crash, and we don't drop packets; we just process things in batches of TAP_MSGS frames at a time. So this doesn't need to be large enough to handle any burst we could ever get, just large enough to adequately mitigate the per-batch costs, which I don't think are _that_ large. 256 was a first guess at that. Maybe it's not enough, but I'd be pretty surprised if it needed to be greater than ~1000 to make the per-batch costs negligible compared to the per-frame costs. UDP_MAX_FRAMES, which is on the reverse path but serves a similar function, is only 32. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson