On Thu, Jun 27, 2024 at 01:45:36AM +0200, Stefano Brivio wrote:
This was reported by Matej a while ago, but we forgot to fix it. Even if the hypervisor is necessarily trusted by passt, as it can in any case terminate the guest or disrupt guest connectivity, it's a good idea to be robust against possible issues.
Instead of resetting the connection to the hypervisor, just skip the offending frame, as we had a few cases where QEMU would get the length descriptor wrong, in the past.
Reported-by: Matej Hrica
Suggested-by: Matej Hrica Signed-off-by: Stefano Brivio --- tap.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tap.c b/tap.c index 7f8c26d..bb993e0 100644 --- a/tap.c +++ b/tap.c @@ -1026,6 +1026,9 @@ redo:
So.. I just realised there's a different pre-existing problem here, above what's quoted in the patch we have: ssize_t l2len = ntohl(*(uint32_t *)p); On a platform where ssize_t is only 32-bits we could get a negative value here, which would be very bad. So l2len should be a uint32_t, not ssize_t. We do then need to make sure that the comparison between l2len and n is unsigned - it's safe to cast n to size_t there, because we've verified it's positive as the loop condition. Or... maybe it's simpler. The frame length is encoded as 32-bits, but we can't meaningfully have frames above 64k (maybe 64k+ETH_HLEN). So possibly we should just reset the tap connection if we see such a frame (most likely it means we've somehow gotten out of sync, anyway).
p += sizeof(uint32_t); n -= sizeof(uint32_t);
+ if (l2len > (ssize_t)TAP_BUF_BYTES - n)
I hate to discard valid frames from the guest.
+ goto next;
..and this is not safe. This skips (l2len > n) check, which means that the n -= l2len at next could now have a signed overflow, which is UB.
+ /* At most one packet might not fit in a single read, and this * needs to be blocking. */
-- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson