From the commit message of the patch under discussion:

Currently we set EPOLLET (edge trigger) on the epoll flags for the
connected Qemu Unix socket. It's not clear that there's a reason for
doing this: for TCP sockets we need to use EPOLLET, because we leave
data in the socket buffers for our flow control handling. That
consideration doesn't apply to the way we handle the qemu socket,
however.

Stefano Brivio replied:

It significantly decreases epoll_wait() overhead on sustained data
transfers, because we can read multiple TAP_BUF_SIZE buffers at a time
instead of just one.

On Fri, 26 Jul 2024 17:20:29 +1000, David Gibson
<david(a)gibson.dropbear.id.au> wrote:

That's a reason to keep the loop, but not EPOLLET itself, AFAICT. I'd
be happy enough to put the loop back in as an optimization (although
I'd prefer to avoid the goto).

On Fri, Jul 26, 2024 at 10:00:56AM +0200, Stefano Brivio wrote:

True, we could actually have that loop back without EPOLLET. But the
reason why I added EPOLLET, despite the resulting complexity, was
definitely increased overhead without it. I can't remember exactly what
that looked like (and unfortunately I didn't write it in that commit
message from 2021): whether we had spurious or more frequent wake-ups,
or something else. Maybe it was a side effect of something that has
since been fixed or otherwise changed, but we should still give this a
pass with perf(1) before we try to optimise it again (if it even needs
to be optimised, that is).

On Fri, 26 Jul 2024 22:12:27 +1000, David Gibson
<david(a)gibson.dropbear.id.au> wrote:

I think we need to understand if and why that's still the case before
putting this back in. I can see an obvious reason why the loop might
reduce overhead, but not why the EPOLLET flag itself would. If
anything, I'd expect level-triggered events to more accurately give us
wakeups only exactly when we need them.

Note also that even looping without EPOLLET does have its own cost
here: it potentially allows a heavy burst of traffic from qemu to
starve processing of events on other sockets.

--
David Gibson (he or they)      | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
                               | around.
http://www.ozlabs.org/~dgibson
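
To make the trade-off above concrete, here is a rough sketch of the
level-triggered variant being discussed: the fd is registered without
EPOLLET, and a bounded loop still drains several buffers per wakeup
while capping how long one socket can hold the event loop. This is an
illustration only, not passt's actual tap handler; tap_fd, pkt_buf,
tap_handle_frames(), MAX_READS, and the TAP_BUF_SIZE value here are all
invented stand-ins.

/* Illustrative sketch only -- not passt's actual code.  A bounded,
 * level-triggered read loop on the qemu socket: the fd is registered
 * *without* EPOLLET, and each wakeup drains at most MAX_READS buffers.
 * tap_fd, pkt_buf, tap_handle_frames() and MAX_READS are invented
 * names.
 */
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

#define TAP_BUF_SIZE	(128 * 1024)	/* stand-in for the real size */
#define MAX_READS	8		/* cap per wakeup: don't starve other fds */

static char pkt_buf[TAP_BUF_SIZE];

/* Stub: parsing and forwarding of the frames in buf is not shown. */
static void tap_handle_frames(const char *buf, ssize_t len)
{
	(void)buf;
	(void)len;
}

/* Called when epoll_wait() reports EPOLLIN on tap_fd. */
static void tap_qemu_sock_handler(int tap_fd)
{
	int i;

	for (i = 0; i < MAX_READS; i++) {
		ssize_t n = recv(tap_fd, pkt_buf, sizeof(pkt_buf),
				 MSG_DONTWAIT);

		if (n <= 0) {
			if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
				return;	/* fully drained */
			/* error or EOF: reset handling would go here */
			return;
		}

		tap_handle_frames(pkt_buf, n);
	}
	/*
	 * MAX_READS buffers read and data may remain: simply return.
	 * Because the fd is level-triggered, the next epoll_wait()
	 * reports it again, so nothing is lost, and other sockets get
	 * a chance to be serviced in between.
	 */
}

Whether something like this actually beats the current EPOLLET code is
exactly what a pass with perf(1), as suggested above, would have to
show.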