On Thu, Nov 17, 2022 at 04:49:31PM +0100, Stefano Brivio wrote:
On Thu, 17 Nov 2022 15:33:06 +0000 "Richard W.M. Jones" <rjones(a)redhat.com> wrote:

With EPOLLRDHUP removed it's a bit deadlock-y. One case I commonly see is:

child (qemu):

  1295637 write(5, "\0\0\0\0", 4 <unfinished ...>
  1295637 <... write resumed>) = 4
  1295637 poll([{fd=5, events=POLLIN|POLLOUT}], 1, -1 <unfinished ...>
  1295637 <... poll resumed>) = 1 ([{fd=5, revents=POLLOUT}])
  1295637 read(3, <unfinished ...>
  1295637 <... read resumed>"", 512) = 0
  1295637 shutdown(5, SHUT_WR <unfinished ...>
  1295637 <... shutdown resumed>) = 0
  1295637 poll([{fd=5, events=POLLIN}], 1, -1 <unfinished ...>

then later the parent (passt):

  1295636 epoll_wait(5, 0x7ffe93a894e0, 8, 1000) = 1
  1295636 recvfrom(4, 0x5576383cf000, 8323069, MSG_DONTWAIT, NULL, NULL) = 4
  1295636 epoll_wait(5, [], 8, 1000) = 0
  1295636 epoll_wait(5, [], 8, 1000) = 0
  1295636 epoll_wait(5, [], 8, 1000) = 0
  1295636 epoll_wait(5, [], 8, 1000) = 0
  1295636 epoll_wait(5, [], 8, 1000) = 0
  1295636 epoll_wait(5, [], 8, 1000) = 0
  1295636 epoll_wait(5, [], 8, 1000) = 0
  1295636 epoll_wait(5, [], 8, 1000) = 0
  1295636 epoll_wait(5, [], 8, 1000) = 0
  (forever)

Removing EPOLLET (edge triggered) delivers the EPOLLIN event over and
over again to passt as expected:

  1299436 recvfrom(4, "", 8323069, MSG_DONTWAIT, NULL, NULL) = 0
  1299436 epoll_wait(5, 0x7ffdd8a65640, 8, 1000) = 1
  1299436 recvfrom(4, "", 8323069, MSG_DONTWAIT, NULL, NULL) = 0
  1299436 epoll_wait(5, 0x7ffdd8a65640, 8, 1000) = 1
  1299436 recvfrom(4, "", 8323069, MSG_DONTWAIT, NULL, NULL) = 0
  (forever)

but I expected that passt would exit the first time it reads 0 from
the socket. tap_handler_passt() seems like it only considers the n<0
(error) and n>0 (data) cases.

What do you think about checking for n == 0 and exiting immediately if
c->one_off is true?

This seems to only happen under strace.

Rich.
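
For illustration, here is a minimal sketch of the n == 0 check suggested above, kept separate from the real tap_handler_passt(). The enum and the function name tap_read_once() are hypothetical and are not passt identifiers; the only thing taken from the thread is the non-blocking recv() with MSG_DONTWAIT. A caller in one-off mode could treat the EOF result as the signal to exit.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical result codes for one read attempt; not passt identifiers. */
enum tap_read_result {
	TAP_READ_DATA,	/* n > 0: data is available for processing */
	TAP_READ_AGAIN,	/* n < 0, nothing to read right now */
	TAP_READ_ERROR,	/* n < 0, any other error */
	TAP_READ_EOF	/* n == 0: peer closed or shut down its write side */
};

/*
 * One non-blocking read from a stream socket, classifying all three
 * outcomes of recv(): n > 0, n < 0 and n == 0.
 */
enum tap_read_result tap_read_once(int fd, void *buf, size_t len, ssize_t *n)
{
	/* With len == 0, recv() can return 0 without meaning end-of-file. */
	assert(len > 0);

	*n = recv(fd, buf, len, MSG_DONTWAIT);
	if (*n > 0)
		return TAP_READ_DATA;
	if (*n == 0)
		return TAP_READ_EOF;
	if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
		return TAP_READ_AGAIN;
	return TAP_READ_ERROR;
}
```

With a socket pair where the peer writes four bytes and then calls shutdown(SHUT_WR), as in the trace above, the first call should report data with *n == 4 and the next call should report EOF instead of looping.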
-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org

On Thu, Nov 17, 2022 at 04:26:40PM +0100, Stefano Brivio wrote:

Out-of-band, so to speak: we won't even recv() if we get EPOLLRDHUP
(that's handled in tap_handler()).

If I do this on top of this patch:

--- a/tap.c
+++ b/tap.c
@@ -1073,7 +1073,7 @@ void tap_sock_init(struct ctx *c)
 	struct epoll_event ev = { 0 };
 
 	ev.data.fd = c->fd_tap;
-	ev.events = EPOLLIN | EPOLLET | EPOLLRDHUP;
+	ev.events = EPOLLIN | EPOLLET;

Then it gets those four bytes:

  [pid 2538704] epoll_wait(5, 0x7ffedc4a6320, 8, 1000) = 1
  [pid 2538704] recvfrom(4, 0x560797677000, 8323069, MSG_DONTWAIT, NULL, NULL) = 4
  [pid 2538704] epoll_wait(5, [], 8, 1000) = 0
  [pid 2538704] epoll_wait(5, 0x7ffedc4a6320, 8, 1000) = -1 EINTR (Interrupted system call)

and does nothing with them, as expected. Two epoll_wait() calls later,
the syscall is interrupted, I'm not sure why and how we should react
(in main(), passt.c) in that case.

From: "Richard W.M. Jones" <rjones(a)redhat.com>

This passes a fully connected stream socket to passt.

Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com>
[sbrivio: reuse fd_tap instead of adding a new descriptor,
 imply --one-off on --fd, add to optstring and usage()]
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>
---
v2:
 - reuse fd_tap, we don't need a separate file descriptor
 - add F to optstring and usage(), for both passt and pasta
 - imply --one-off, we can't do much once the socket is closed

With this, the trick from 5/5 goes a bit further: passt reads from the
file descriptor passed by the wrapper.

Thanks for the v2 .. I'll add it to my series and play with it.

However, we get EPOLLRDHUP right away, from the close() on one end of
the socket pair I guess.
Should we just ignore all EPOLLRDHUP events, just the first one...?

Does it see the event out-of-band or does it get an in-band read(2) == 0
after finishing reading the data?