[PATCH 0/4] Improve robustness of calculations related to frame size limits
There are a number of places where we make calculations and checks around how large frames can be and where they sit in memory. Several of these are roughly correct, but can be wrong in certain edge cases. Improve robustness by clarifying what we're doing and being more careful about the edge cases. David Gibson (4): vu_common: Tighten vu_packet_check_range() packet: More cautious checks to avoid pointer arithmetic UB tap: Make size of pool_tap[46] purely a tuning parameter tap: Clarify calculation of TAP_MSGS packet.c | 25 +++++++++++++++++++++---- packet.h | 3 +++ passt.h | 2 -- tap.c | 43 ++++++++++++++++++++++++++++++++++++------- tap.h | 3 ++- vu_common.c | 15 ++++++++++----- 6 files changed, 72 insertions(+), 19 deletions(-) -- 2.48.1
This function verifies that the given packet is within the mmap()ed memory
region of the vhost-user device. We can do better, however. The packet
should be not only within the mmap()ed range, but specifically in the
subsection of that range set aside for shared buffers, which starts at
dev_region->mmap_offset within there.
Signed-off-by: David Gibson
packet_check_range and vu_packet_check_range() verify that the packet or
section of packet we're interested in lies in the packet buffer pool we
expect it to. However, in doing so it doesn't avoid the possibility of
an integer overflow while performing pointer arithmetic, with is UB. In
fact, AFAICT it's UB even to use arbitrary pointer arithmetic to construct
a pointer outside of a known valid buffer.
To do this safely, we can't calculate the end of a memory region with
pointer addition when then the length as untrusted. Instead we must work
out the offset of one memory region within another using pointer
subtraction, then do integer checks against the length of the outer region.
We then need to be careful about the order of checks so that those integer
checks can't themselves overflow.
Signed-off-by: David Gibson
Currently we attempt to size pool_tap[46] so they have room for the maximum
possible number of packets that could fit in pkt_buf (TAP_MSGS). However,
the calculation isn't quite correct: TAP_MSGS is based on ETH_ZLEN (60) as
the minimum possible L2 frame size. But ETH_ZLEN is based on physical
constraints of Ethernet, which don't apply to our virtual devices. It is
possible to generate a legitimate frame smaller than this, for example an
empty payload UDP/IPv4 frame on the 'pasta' backend is only 42 bytes long.
Further more, the same limit applies for vhost-user, which is not limited
by the size of pkt_buf like the other backends. In that case we don't even
have full control of the maximum buffer size, so we can't really calculate
how many packets could fit in there.
If we exceed do TAP_MSGS we'll drop packets, not just use more batches,
which is moderately bad. The fact that this needs to be sized just so for
correctness not merely for tuning is a fairly non-obvious coupling between
different parts of the code.
To make this more robust, alter the tap code so it doesn't rely on
everything fitting in a single batch of TAP_MSGS packets, instead breaking
into multiple batches as necessary. This leaves TAP_MSGS as purely a
tuning parameter, which we can freely adjust based on performance measures.
Signed-off-by: David Gibson
The rationale behind the calculation of TAP_MSGS isn't necessarily obvious.
It's supposed to be the maximum number of packets that can fit in pkt_buf.
However, the calculation is wrong in several ways:
* It's based on ETH_ZLEN which isn't meaningful for virtual devices
* It always includes the qemu socket header which isn't used for pasta
* The size of pkt_buf isn't relevant for vhost-user
We've already made sure this is just a tuning parameter, not a hard limit.
Clarify what we're calculating here and why.
Signed-off-by: David Gibson
participants (1)
-
David Gibson