(Note this patch series does not work so far and needs some help, read on ...) Patches 1 & 2 are general cleanup. The rest of the patches aim to add fuzzing support for Passt using AFL++, Clang and ASAN. I used the same approach as with libnbd: https://gitlab.com/nbdkit/libnbd/-/tree/master/fuzzing Firstly (patch 3) I added an --fd option. This is useful for fuzzing, but also generally useful. It allows a controller process to open a connected stream socket and pass that down to passt via inheritance. Uses outside fuzzing include: having the controlling process open the socket with elevated privleges, and allowing alternate address families to be used (eg. vsock or IB). Unfortunately I don't think the --fd option is working. stracing the code shows the socket being added to the epoll, but it somehow never gets read. It might be something obvious but I couldn't see what was wrong. (NB: The socket passed in is *connected* already). Patch 4 adds the fuzzing wrapper. The purpose of the wrapper is to allow AFL to submit test cases to passt as local files. It works by creating a socketpair(2), forking and execing passt in the parent: (parent) passt -f -e -1 --fd <sock> | ^ | | (child) | input file V /dev/null The child reads the input file (test case) from the command line and pushes it into the socket, while discarding anything written by passt. IOW the child takes the place of qemu. With all patches applied you can test the wrapper alone using: $ ./fuzz-wrapper testcase_dir/empty_tap You will see that it currently hangs which it should not do, and I suspect the problem is related to the implementation of --fd as the wrapper is old and well-tested code. Rich.
If you run the build several times it will fail unnecessarily with: ln -s passt pasta ln: failed to create symbolic link 'pasta': File exists make: *** [Makefile:134: pasta] Error 1 Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 6b22408..c13a818 100644 --- a/Makefile +++ b/Makefile @@ -131,7 +131,7 @@ passt.avx2: $(PASST_SRCS) $(HEADERS) passt.avx2: passt pasta.avx2 pasta.1 pasta: pasta%: passt% - ln -s $< $@ + ln -sf $< $@ qrap: $(QRAP_SRCS) passt.h $(CC) $(FLAGS) $(CFLAGS) $(QRAP_SRCS) -o qrap $(LDFLAGS) -- 2.37.0.rc2
These files are left around by emacs amongst other editors. Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index c13a818..e370f0a 100644 --- a/Makefile +++ b/Makefile @@ -144,7 +144,7 @@ valgrind: all .PHONY: clean clean: - $(RM) $(BIN) *.o seccomp.h pasta.1 \ + $(RM) $(BIN) *~ *.o seccomp.h pasta.1 \ passt.tar passt.tar.gz *.deb *.rpm \ passt.pid README.plain.md -- 2.37.0.rc2
This passes a fully connected stream socket to passt. Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> --- conf.c | 15 +++++++++++++++ passt.1 | 10 ++++++++++ passt.c | 2 +- passt.h | 2 ++ tap.c | 22 +++++++++++++++++++++- 5 files changed, 49 insertions(+), 2 deletions(-) diff --git a/conf.c b/conf.c index 1adcf83..5e08bd6 100644 --- a/conf.c +++ b/conf.c @@ -1089,6 +1089,7 @@ void conf(struct ctx *c, int argc, char **argv) {"log-file", required_argument, NULL, 'l' }, {"help", no_argument, NULL, 'h' }, {"socket", required_argument, NULL, 's' }, + {"fd", required_argument, NULL, 'F' }, {"ns-ifname", required_argument, NULL, 'I' }, {"pcap", required_argument, NULL, 'p' }, {"pid", required_argument, NULL, 'P' }, @@ -1366,6 +1367,15 @@ void conf(struct ctx *c, int argc, char **argv) usage(argv[0]); } break; + case 'F': + if (c->sock_fd >= 0) { + err("Multiple --fd options given"); + usage(argv[0]); + } + + errno = 0; + c->sock_fd = strtol(optarg, NULL, 0); + break; case 'I': if (*c->pasta_ifn) { err("Multiple --ns-ifname options given"); @@ -1600,6 +1610,11 @@ void conf(struct ctx *c, int argc, char **argv) usage(argv[0]); } + if (*c->sock_path && c->sock_fd >= 0) { + err("Options --socket and --fd are mutually exclusive"); + usage(argv[0]); + } + ret = conf_ugid(runas, &uid, &gid); if (ret) usage(argv[0]); diff --git a/passt.1 b/passt.1 index e34a3e0..66d5eb6 100644 --- a/passt.1 +++ b/passt.1 @@ -297,6 +297,16 @@ Path for UNIX domain socket used by \fBqemu\fR(1) or \fBqrap\fR(1) to connect to Default is to probe a free socket, not accepting connections, starting from \fI/tmp/passt_1.socket\fR to \fI/tmp/passt_64.socket\fR. +.TP +.BR \-F ", " \-\-fd " " \fIFD +Pass a pre-opened, connected socket to +\fBpasst\fR. +Usually the socket is opened in the parent process and +\fBpasst\fR +inherits it when run as a child. This allows the parent process to +open sockets using another address family or requiring special +privileges. + .TP .BR \-1 ", " \-\-one-off Quit after handling a single client connection, that is, once the client closes diff --git a/passt.c b/passt.c index 7d323c2..ef4643e 100644 --- a/passt.c +++ b/passt.c @@ -187,7 +187,7 @@ int main(int argc, char **argv) isolate_initial(); - c.pasta_netns_fd = c.fd_tap = c.fd_tap_listen = -1; + c.pasta_netns_fd = c.fd_tap = c.fd_tap_listen = c.sock_fd = -1; sigemptyset(&sa.sa_mask); sa.sa_flags = 0; diff --git a/passt.h b/passt.h index e93eea8..32fc4ae 100644 --- a/passt.h +++ b/passt.h @@ -149,6 +149,7 @@ struct ip6_ctx { * @stderr: Force logging to stderr * @nofile: Maximum number of open files (ulimit -n) * @sock_path: Path for UNIX domain socket + * @sock_fd: Connected UNIX domain socket * @pcap: Path for packet capture file * @pid_file: Path to PID file, empty string if not configured * @pasta_netns_fd: File descriptor for network namespace in pasta mode @@ -198,6 +199,7 @@ struct ctx { int stderr; int nofile; char sock_path[UNIX_PATH_MAX]; + int sock_fd; char pcap[PATH_MAX]; char pid_file[PATH_MAX]; int one_off; diff --git a/tap.c b/tap.c index abeff25..418a788 100644 --- a/tap.c +++ b/tap.c @@ -1063,6 +1063,24 @@ static void tap_sock_tun_init(struct ctx *c) epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev); } +/** + * tap_sock_fd_init() - Set up a connected socket (c->sock_fd) + * @c: Execution context + */ +void tap_sock_fd_init(struct ctx *c) +{ + struct epoll_event ev = { 0 }; + + c->fd_tap = c->sock_fd; + c->sock_fd = -1; + + info("Setting up fd %d as tap socket", c->fd_tap); + + ev.data.fd = c->fd_tap; + ev.events = EPOLLIN | EPOLLRDHUP; + epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev); +} + /** * tap_sock_init() - Create and set up AF_UNIX socket or tuntap file descriptor * @c: Execution context @@ -1087,7 +1105,9 @@ void tap_sock_init(struct ctx *c) } if (c->mode == MODE_PASST) { - if (c->fd_tap_listen == -1) + if (c->sock_fd >= 0) + tap_sock_fd_init(c); + else if (c->fd_tap_listen == -1) tap_sock_unix_init(c); } else { tap_sock_tun_init(c); -- 2.37.0.rc2
From: "Richard W.M. Jones" <rjones(a)redhat.com> This passes a fully connected stream socket to passt. Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> [sbrivio: reuse fd_tap instead of adding a new descriptor, imply --one-off on --fd, add to optstring and usage()] Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- v2: - reuse fd_tap, we don't need a separate file descriptor - add F to optstring and usage(), for both passt and pasta - imply --one-off, we can't do much once the socket is closed With this, the trick from 5/5 goes a bit further: passt reads from the file descriptor passed by the wrapper. However, we get EPOLLRDHUP right away, from the close() on one end of the socket pair I guess. Should we just ignore all EPOLLRDHUP events, just the first one...? conf.c | 28 ++++++++++++++++++++++++++-- passt.1 | 10 ++++++++++ passt.c | 1 - passt.h | 2 +- tap.c | 9 +++++++++ 5 files changed, 46 insertions(+), 4 deletions(-) diff --git a/conf.c b/conf.c index a995eb7..2c374bd 100644 --- a/conf.c +++ b/conf.c @@ -719,6 +719,7 @@ static void usage(const char *name) UNIX_SOCK_PATH, 1); } + info( " -F, --fd FD Use FD as pre-opened connected socket"); info( " -p, --pcap FILE Log tap-facing traffic to pcap file"); info( " -P, --pid FILE Write own PID to the given file"); info( " -m, --mtu MTU Assign MTU via DHCP/NDP"); @@ -1079,6 +1080,7 @@ void conf(struct ctx *c, int argc, char **argv) {"log-file", required_argument, NULL, 'l' }, {"help", no_argument, NULL, 'h' }, {"socket", required_argument, NULL, 's' }, + {"fd", required_argument, NULL, 'F' }, {"ns-ifname", required_argument, NULL, 'I' }, {"pcap", required_argument, NULL, 'p' }, {"pid", required_argument, NULL, 'P' }, @@ -1138,9 +1140,9 @@ void conf(struct ctx *c, int argc, char **argv) if (c->mode == MODE_PASTA) { c->no_dhcp_dns = c->no_dhcp_dns_search = 1; - optstring = "dqfel:hI:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:"; + optstring = "dqfel:hf:I:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:"; } else { - optstring = "dqfel:hs:p:P:m:a:n:M:g:i:D:S:461t:u:"; + optstring = "dqfel:hs:f:p:P:m:a:n:M:g:i:D:S:461t:u:"; } c->tcp.fwd_in.mode = c->tcp.fwd_out.mode = 0; @@ -1355,6 +1357,23 @@ void conf(struct ctx *c, int argc, char **argv) err("Invalid socket path: %s", optarg); usage(argv[0]); } + break; + case 'F': + if (c->fd_tap >= 0) { + err("Multiple --fd options given"); + usage(argv[0]); + } + + errno = 0; + c->fd_tap = strtol(optarg, NULL, 0); + + if (c->fd_tap < 0 || errno) { + err("Invalid --fd: %s", optarg); + usage(argv[0]); + } + + c->one_off = true; + break; case 'I': if (*c->pasta_ifn) { @@ -1590,6 +1609,11 @@ void conf(struct ctx *c, int argc, char **argv) usage(argv[0]); } + if (*c->sock_path && c->fd_tap >= 0) { + err("Options --socket and --fd are mutually exclusive"); + usage(argv[0]); + } + ret = conf_ugid(runas, &uid, &gid); if (ret) usage(argv[0]); diff --git a/passt.1 b/passt.1 index e34a3e0..528763b 100644 --- a/passt.1 +++ b/passt.1 @@ -297,6 +297,16 @@ Path for UNIX domain socket used by \fBqemu\fR(1) or \fBqrap\fR(1) to connect to Default is to probe a free socket, not accepting connections, starting from \fI/tmp/passt_1.socket\fR to \fI/tmp/passt_64.socket\fR. +.TP +.BR \-F ", " \-\-fd " " \fIFD +Pass a pre-opened, connected socket to \fBpasst\fR. Usually the socket is opened +in the parent process and \fBpasst\fR inherits it when run as a child. This +allows the parent process to open sockets using another address family or +requiring special privileges. + +This option implies the behaviour described for \-\-one-off, once this socket +is closed. + .TP .BR \-1 ", " \-\-one-off Quit after handling a single client connection, that is, once the client closes diff --git a/passt.c b/passt.c index 7d323c2..8b2c50d 100644 --- a/passt.c +++ b/passt.c @@ -255,7 +255,6 @@ int main(int argc, char **argv) quit_fd = pasta_netns_quit_init(&c); - c.fd_tap = c.fd_tap_listen = -1; tap_sock_init(&c); clock_gettime(CLOCK_MONOTONIC, &now); diff --git a/passt.h b/passt.h index 6649c0a..ca25b90 100644 --- a/passt.h +++ b/passt.h @@ -159,7 +159,7 @@ struct ip6_ctx { * @proc_net_udp: Stored handles for /proc/net/udp{,6} in init and ns * @epollfd: File descriptor for epoll instance * @fd_tap_listen: File descriptor for listening AF_UNIX socket, if any - * @fd_tap: File descriptor for AF_UNIX socket or tuntap device + * @fd_tap: AF_UNIX socket, tuntap device, or pre-opened socket * @mac: Host MAC address * @mac_guest: MAC address of guest or namespace, seen or configured * @ifi4: Index of routable interface for IPv4, 0 if IPv4 disabled diff --git a/tap.c b/tap.c index d26af58..9998127 100644 --- a/tap.c +++ b/tap.c @@ -1069,6 +1069,15 @@ void tap_sock_init(struct ctx *c) } if (c->fd_tap != -1) { + if (c->one_off) { /* Passed as --fd */ + struct epoll_event ev = { 0 }; + + ev.data.fd = c->fd_tap; + ev.events = EPOLLIN | EPOLLET | EPOLLRDHUP; + epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev); + return; + } + epoll_ctl(c->epollfd, EPOLL_CTL_DEL, c->fd_tap, NULL); close(c->fd_tap); c->fd_tap = -1; -- 2.35.1
On Thu, 17 Nov 2022 16:26:40 +0100 Stefano Brivio <sbrivio(a)redhat.com> wrote:@@ -1138,9 +1140,9 @@ void conf(struct ctx *c, int argc, char **argv) if (c->mode == MODE_PASTA) { c->no_dhcp_dns = c->no_dhcp_dns_search = 1; - optstring = "dqfel:hI:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:"; + optstring = "dqfel:hf:I:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:"; } else { - optstring = "dqfel:hs:p:P:m:a:n:M:g:i:D:S:461t:u:"; + optstring = "dqfel:hs:f:p:P:m:a:n:M:g:i:D:S:461t:u:";Should be 'F', really. -- Stefano
On Thu, Nov 17, 2022 at 04:31:05PM +0100, Stefano Brivio wrote:On Thu, 17 Nov 2022 16:26:40 +0100 Stefano Brivio <sbrivio(a)redhat.com> wrote:OK, will update my copy too. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top@@ -1138,9 +1140,9 @@ void conf(struct ctx *c, int argc, char **argv) if (c->mode == MODE_PASTA) { c->no_dhcp_dns = c->no_dhcp_dns_search = 1; - optstring = "dqfel:hI:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:"; + optstring = "dqfel:hf:I:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:"; } else { - optstring = "dqfel:hs:p:P:m:a:n:M:g:i:D:S:461t:u:"; + optstring = "dqfel:hs:f:p:P:m:a:n:M:g:i:D:S:461t:u:";Should be 'F', really.
On Thu, Nov 17, 2022 at 04:26:40PM +0100, Stefano Brivio wrote:From: "Richard W.M. Jones" <rjones(a)redhat.com> This passes a fully connected stream socket to passt. Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> [sbrivio: reuse fd_tap instead of adding a new descriptor, imply --one-off on --fd, add to optstring and usage()] Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- v2: - reuse fd_tap, we don't need a separate file descriptor - add F to optstring and usage(), for both passt and pasta - imply --one-off, we can't do much once the socket is closed With this, the trick from 5/5 goes a bit further: passt reads from the file descriptor passed by the wrapper.Thanks for the v2 .. I'll add it to my series and play with it.However, we get EPOLLRDHUP right away, from the close() on one end of the socket pair I guess. Should we just ignore all EPOLLRDHUP events, just the first one...?Does it see the event out-of-band or does it get an in-band read(2) == 0 after finishing reading the data? Ideally what should happen is that passt reads and parses all or as much of the data as it can, and then it responds to the closed socket by exiting. The reason for this is so that we have as much opportunity as possible to break the parser and crash passt. If passt was exiting before fully reading/parsing everything that it could, then we wouldn't be fuzzing the parser as deeply as we can. Note that the fuzzer will (quite routinely and normally) create test cases which are total nonsense, eg. packets with incorrect length field, truncated final packet, etc. Rich.conf.c | 28 ++++++++++++++++++++++++++-- passt.1 | 10 ++++++++++ passt.c | 1 - passt.h | 2 +- tap.c | 9 +++++++++ 5 files changed, 46 insertions(+), 4 deletions(-) diff --git a/conf.c b/conf.c index a995eb7..2c374bd 100644 --- a/conf.c +++ b/conf.c @@ -719,6 +719,7 @@ static void usage(const char *name) UNIX_SOCK_PATH, 1); } + info( " -F, --fd FD Use FD as pre-opened connected socket"); info( " -p, --pcap FILE Log tap-facing traffic to pcap file"); info( " -P, --pid FILE Write own PID to the given file"); info( " -m, --mtu MTU Assign MTU via DHCP/NDP"); @@ -1079,6 +1080,7 @@ void conf(struct ctx *c, int argc, char **argv) {"log-file", required_argument, NULL, 'l' }, {"help", no_argument, NULL, 'h' }, {"socket", required_argument, NULL, 's' }, + {"fd", required_argument, NULL, 'F' }, {"ns-ifname", required_argument, NULL, 'I' }, {"pcap", required_argument, NULL, 'p' }, {"pid", required_argument, NULL, 'P' }, @@ -1138,9 +1140,9 @@ void conf(struct ctx *c, int argc, char **argv) if (c->mode == MODE_PASTA) { c->no_dhcp_dns = c->no_dhcp_dns_search = 1; - optstring = "dqfel:hI:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:"; + optstring = "dqfel:hf:I:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:"; } else { - optstring = "dqfel:hs:p:P:m:a:n:M:g:i:D:S:461t:u:"; + optstring = "dqfel:hs:f:p:P:m:a:n:M:g:i:D:S:461t:u:"; } c->tcp.fwd_in.mode = c->tcp.fwd_out.mode = 0; @@ -1355,6 +1357,23 @@ void conf(struct ctx *c, int argc, char **argv) err("Invalid socket path: %s", optarg); usage(argv[0]); } + break; + case 'F': + if (c->fd_tap >= 0) { + err("Multiple --fd options given"); + usage(argv[0]); + } + + errno = 0; + c->fd_tap = strtol(optarg, NULL, 0); + + if (c->fd_tap < 0 || errno) { + err("Invalid --fd: %s", optarg); + usage(argv[0]); + } + + c->one_off = true; + break; case 'I': if (*c->pasta_ifn) { @@ -1590,6 +1609,11 @@ void conf(struct ctx *c, int argc, char **argv) usage(argv[0]); } + if (*c->sock_path && c->fd_tap >= 0) { + err("Options --socket and --fd are mutually exclusive"); + usage(argv[0]); + } + ret = conf_ugid(runas, &uid, &gid); if (ret) usage(argv[0]); diff --git a/passt.1 b/passt.1 index e34a3e0..528763b 100644 --- a/passt.1 +++ b/passt.1 @@ -297,6 +297,16 @@ Path for UNIX domain socket used by \fBqemu\fR(1) or \fBqrap\fR(1) to connect to Default is to probe a free socket, not accepting connections, starting from \fI/tmp/passt_1.socket\fR to \fI/tmp/passt_64.socket\fR. +.TP +.BR \-F ", " \-\-fd " " \fIFD +Pass a pre-opened, connected socket to \fBpasst\fR. Usually the socket is opened +in the parent process and \fBpasst\fR inherits it when run as a child. This +allows the parent process to open sockets using another address family or +requiring special privileges. + +This option implies the behaviour described for \-\-one-off, once this socket +is closed. + .TP .BR \-1 ", " \-\-one-off Quit after handling a single client connection, that is, once the client closes diff --git a/passt.c b/passt.c index 7d323c2..8b2c50d 100644 --- a/passt.c +++ b/passt.c @@ -255,7 +255,6 @@ int main(int argc, char **argv) quit_fd = pasta_netns_quit_init(&c); - c.fd_tap = c.fd_tap_listen = -1; tap_sock_init(&c); clock_gettime(CLOCK_MONOTONIC, &now); diff --git a/passt.h b/passt.h index 6649c0a..ca25b90 100644 --- a/passt.h +++ b/passt.h @@ -159,7 +159,7 @@ struct ip6_ctx { * @proc_net_udp: Stored handles for /proc/net/udp{,6} in init and ns * @epollfd: File descriptor for epoll instance * @fd_tap_listen: File descriptor for listening AF_UNIX socket, if any - * @fd_tap: File descriptor for AF_UNIX socket or tuntap device + * @fd_tap: AF_UNIX socket, tuntap device, or pre-opened socket * @mac: Host MAC address * @mac_guest: MAC address of guest or namespace, seen or configured * @ifi4: Index of routable interface for IPv4, 0 if IPv4 disabled diff --git a/tap.c b/tap.c index d26af58..9998127 100644 --- a/tap.c +++ b/tap.c @@ -1069,6 +1069,15 @@ void tap_sock_init(struct ctx *c) } if (c->fd_tap != -1) { + if (c->one_off) { /* Passed as --fd */ + struct epoll_event ev = { 0 }; + + ev.data.fd = c->fd_tap; + ev.events = EPOLLIN | EPOLLET | EPOLLRDHUP; + epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev); + return; + } + epoll_ctl(c->epollfd, EPOLL_CTL_DEL, c->fd_tap, NULL); close(c->fd_tap); c->fd_tap = -1; -- 2.35.1-- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html
On Thu, 17 Nov 2022 15:33:06 +0000 "Richard W.M. Jones" <rjones(a)redhat.com> wrote:On Thu, Nov 17, 2022 at 04:26:40PM +0100, Stefano Brivio wrote:Out-of-band, so to speak: we won't even recv() if we get EPOLLRDHUP (that's handled in tap_handler()). If I do this on top of this patch: --- a/tap.c +++ b/tap.c @@ -1073,7 +1073,7 @@ void tap_sock_init(struct ctx *c) struct epoll_event ev = { 0 }; ev.data.fd = c->fd_tap; - ev.events = EPOLLIN | EPOLLET | EPOLLRDHUP; + ev.events = EPOLLIN | EPOLLET; Then it gets those four bytes: [pid 2538704] epoll_wait(5, 0x7ffedc4a6320, 8, 1000) = 1 [pid 2538704] recvfrom(4, 0x560797677000, 8323069, MSG_DONTWAIT, NULL, NULL) = 4 [pid 2538704] epoll_wait(5, [], 8, 1000) = 0 [pid 2538704] epoll_wait(5, 0x7ffedc4a6320, 8, 1000) = -1 EINTR (Interrupted system call) and does nothing with them, as expected. Two epoll_wait() calls later, the syscall is interrupted, I'm not sure why and how we should react (in main(), passt.c) in that case.From: "Richard W.M. Jones" <rjones(a)redhat.com> This passes a fully connected stream socket to passt. Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> [sbrivio: reuse fd_tap instead of adding a new descriptor, imply --one-off on --fd, add to optstring and usage()] Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- v2: - reuse fd_tap, we don't need a separate file descriptor - add F to optstring and usage(), for both passt and pasta - imply --one-off, we can't do much once the socket is closed With this, the trick from 5/5 goes a bit further: passt reads from the file descriptor passed by the wrapper.Thanks for the v2 .. I'll add it to my series and play with it.However, we get EPOLLRDHUP right away, from the close() on one end of the socket pair I guess. Should we just ignore all EPOLLRDHUP events, just the first one...?Does it see the event out-of-band or does it get an in-band read(2) == 0 after finishing reading the data?Ideally what should happen is that passt reads and parses all or as much of the data as it can, and then it responds to the closed socket by exiting. The reason for this is so that we have as much opportunity as possible to break the parser and crash passt.So probably ignoring EPOLLRDHUP is safe. If the socket is closed we should get EPOLLHUP.If passt was exiting before fully reading/parsing everything that it could, then we wouldn't be fuzzing the parser as deeply as we can.We would process a EPOLLRDHUP or EPOLLHUP before doing anything, yes (also in tap_handler()). But if we get a EPOLLIN event first, then it's fine, we'll read everything.Note that the fuzzer will (quite routinely and normally) create test cases which are total nonsense, eg. packets with incorrect length field, truncated final packet, etc.Yes yes, I can imagine. :) -- Stefano
On Thu, Nov 17, 2022 at 04:49:31PM +0100, Stefano Brivio wrote:On Thu, 17 Nov 2022 15:33:06 +0000 "Richard W.M. Jones" <rjones(a)redhat.com> wrote:With EPOLLRDHUP removed it's a bit deadlock-y. One case I commonly see is: child (qemu): 1295637 write(5, "\0\0\0\0", 4 <unfinished ...> 1295637 <... write resumed>) = 4 1295637 poll([{fd=5, events=POLLIN|POLLOUT}], 1, -1 <unfinished ...> 1295637 <... poll resumed>) = 1 ([{fd=5, revents=POLLOUT}]) 1295637 read(3, <unfinished ...> 1295637 <... read resumed>"", 512) = 0 1295637 shutdown(5, SHUT_WR <unfinished ...> 1295637 <... shutdown resumed>) = 0 1295637 poll([{fd=5, events=POLLIN}], 1, -1 <unfinished ...> then later the parent (passt): 1295636 epoll_wait(5, 0x7ffe93a894e0, 8, 1000) = 1 1295636 recvfrom(4, 0x5576383cf000, 8323069, MSG_DONTWAIT, NULL, NULL) = 4 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 (forever) Removing EPOLLET (edge triggered) delivers the EPOLLIN event over and over again to passt as expected: 1299436 recvfrom(4, "", 8323069, MSG_DONTWAIT, NULL, NULL) = 0 1299436 epoll_wait(5, 0x7ffdd8a65640, 8, 1000) = 1 1299436 recvfrom(4, "", 8323069, MSG_DONTWAIT, NULL, NULL) = 0 1299436 epoll_wait(5, 0x7ffdd8a65640, 8, 1000) = 1 1299436 recvfrom(4, "", 8323069, MSG_DONTWAIT, NULL, NULL) = 0 (forever) but I expected that passt would exit the first time it reads 0 from the socket. tap_handler_passt() seems like it only considers the n<0 (error) and n>0 (data) cases. What do you think about checking for n == 0 and exiting immediately if c->one_off is true? This seems to only happen under strace. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.orgOn Thu, Nov 17, 2022 at 04:26:40PM +0100, Stefano Brivio wrote:Out-of-band, so to speak: we won't even recv() if we get EPOLLRDHUP (that's handled in tap_handler()). If I do this on top of this patch: --- a/tap.c +++ b/tap.c @@ -1073,7 +1073,7 @@ void tap_sock_init(struct ctx *c) struct epoll_event ev = { 0 }; ev.data.fd = c->fd_tap; - ev.events = EPOLLIN | EPOLLET | EPOLLRDHUP; + ev.events = EPOLLIN | EPOLLET; Then it gets those four bytes: [pid 2538704] epoll_wait(5, 0x7ffedc4a6320, 8, 1000) = 1 [pid 2538704] recvfrom(4, 0x560797677000, 8323069, MSG_DONTWAIT, NULL, NULL) = 4 [pid 2538704] epoll_wait(5, [], 8, 1000) = 0 [pid 2538704] epoll_wait(5, 0x7ffedc4a6320, 8, 1000) = -1 EINTR (Interrupted system call) and does nothing with them, as expected. Two epoll_wait() calls later, the syscall is interrupted, I'm not sure why and how we should react (in main(), passt.c) in that case.From: "Richard W.M. Jones" <rjones(a)redhat.com> This passes a fully connected stream socket to passt. Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> [sbrivio: reuse fd_tap instead of adding a new descriptor, imply --one-off on --fd, add to optstring and usage()] Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- v2: - reuse fd_tap, we don't need a separate file descriptor - add F to optstring and usage(), for both passt and pasta - imply --one-off, we can't do much once the socket is closed With this, the trick from 5/5 goes a bit further: passt reads from the file descriptor passed by the wrapper.Thanks for the v2 .. I'll add it to my series and play with it.However, we get EPOLLRDHUP right away, from the close() on one end of the socket pair I guess. Should we just ignore all EPOLLRDHUP events, just the first one...?Does it see the event out-of-band or does it get an in-band read(2) == 0 after finishing reading the data?
On Thu, 17 Nov 2022 16:02:43 +0000 "Richard W.M. Jones" <rjones(a)redhat.com> wrote:On Thu, Nov 17, 2022 at 04:49:31PM +0100, Stefano Brivio wrote:Ah, yes, we ignore EPOLLRDHUP, and there's a one-second timeout there, so it's actually expected.On Thu, 17 Nov 2022 15:33:06 +0000 "Richard W.M. Jones" <rjones(a)redhat.com> wrote:With EPOLLRDHUP removed it's a bit deadlock-y. One case I commonly see is: child (qemu): 1295637 write(5, "\0\0\0\0", 4 <unfinished ...> 1295637 <... write resumed>) = 4 1295637 poll([{fd=5, events=POLLIN|POLLOUT}], 1, -1 <unfinished ...> 1295637 <... poll resumed>) = 1 ([{fd=5, revents=POLLOUT}]) 1295637 read(3, <unfinished ...> 1295637 <... read resumed>"", 512) = 0 1295637 shutdown(5, SHUT_WR <unfinished ...> 1295637 <... shutdown resumed>) = 0 1295637 poll([{fd=5, events=POLLIN}], 1, -1 <unfinished ...> then later the parent (passt): 1295636 epoll_wait(5, 0x7ffe93a894e0, 8, 1000) = 1 1295636 recvfrom(4, 0x5576383cf000, 8323069, MSG_DONTWAIT, NULL, NULL) = 4 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 1295636 epoll_wait(5, [], 8, 1000) = 0 (forever)On Thu, Nov 17, 2022 at 04:26:40PM +0100, Stefano Brivio wrote:Out-of-band, so to speak: we won't even recv() if we get EPOLLRDHUP (that's handled in tap_handler()). If I do this on top of this patch: --- a/tap.c +++ b/tap.c @@ -1073,7 +1073,7 @@ void tap_sock_init(struct ctx *c) struct epoll_event ev = { 0 }; ev.data.fd = c->fd_tap; - ev.events = EPOLLIN | EPOLLET | EPOLLRDHUP; + ev.events = EPOLLIN | EPOLLET; Then it gets those four bytes: [pid 2538704] epoll_wait(5, 0x7ffedc4a6320, 8, 1000) = 1 [pid 2538704] recvfrom(4, 0x560797677000, 8323069, MSG_DONTWAIT, NULL, NULL) = 4 [pid 2538704] epoll_wait(5, [], 8, 1000) = 0 [pid 2538704] epoll_wait(5, 0x7ffedc4a6320, 8, 1000) = -1 EINTR (Interrupted system call) and does nothing with them, as expected. Two epoll_wait() calls later, the syscall is interrupted, I'm not sure why and how we should react (in main(), passt.c) in that case.From: "Richard W.M. Jones" <rjones(a)redhat.com> This passes a fully connected stream socket to passt. Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> [sbrivio: reuse fd_tap instead of adding a new descriptor, imply --one-off on --fd, add to optstring and usage()] Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- v2: - reuse fd_tap, we don't need a separate file descriptor - add F to optstring and usage(), for both passt and pasta - imply --one-off, we can't do much once the socket is closed With this, the trick from 5/5 goes a bit further: passt reads from the file descriptor passed by the wrapper.Thanks for the v2 .. I'll add it to my series and play with it.However, we get EPOLLRDHUP right away, from the close() on one end of the socket pair I guess. Should we just ignore all EPOLLRDHUP events, just the first one...?Does it see the event out-of-band or does it get an in-band read(2) == 0 after finishing reading the data?Removing EPOLLET (edge triggered) delivers the EPOLLIN event over and over again to passt as expected: 1299436 recvfrom(4, "", 8323069, MSG_DONTWAIT, NULL, NULL) = 0 1299436 epoll_wait(5, 0x7ffdd8a65640, 8, 1000) = 1 1299436 recvfrom(4, "", 8323069, MSG_DONTWAIT, NULL, NULL) = 0 1299436 epoll_wait(5, 0x7ffdd8a65640, 8, 1000) = 1 1299436 recvfrom(4, "", 8323069, MSG_DONTWAIT, NULL, NULL) = 0 (forever)I'm not sure that's EPOLLIN, it could be anything (we don't check explicitly).but I expected that passt would exit the first time it reads 0 from the socket. tap_handler_passt() seems like it only considers the n<0 (error) and n>0 (data) cases. What do you think about checking for n == 0 and exiting immediately if c->one_off is true?I think it's reasonable in any case. At the same time, we don't necessarily need to ignore EPOLLRDHUP, we could also simply try to read from the socket first, that is, move this: if ((c->mode == MODE_PASST && tap_handler_passt(c, now)) || (c->mode == MODE_PASTA && tap_handler_pasta(c, now))) goto reinit; before this: if (events & (EPOLLRDHUP | EPOLLHUP | EPOLLERR)) goto reinit; and we'll just get EBADF on recv(). If we have data to read, we won't necessarily get a EPOLLRDHUP (or EPOLLHUP) and EPOLLIN separately, so trying to read first makes sense for the case you described. An explicit check on EPOLLIN is probably a good idea too: read if we have anything to read, first, then quit if we have a reason to do so. -- Stefano
This is a hack. Ideally there'd be a way to build a "non-production" build of passt which would turn off all the encapsulation features. They are not relevant for fuzzing and simply add overhead. --- Makefile | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/Makefile b/Makefile index e370f0a..aa484d2 100644 --- a/Makefile +++ b/Makefile @@ -117,6 +117,19 @@ all: $(BIN) $(MANPAGES) docs static: FLAGS += -static -DGLIBC_NO_STATIC_NSS static: clean all +# XXX Hack for AFL instrumentation +EXTRA_SYSCALLS += \ + clone \ + getpid \ + gettid \ + madvise \ + mmap \ + mprotect \ + prctl \ + rt_sigprocmask \ + sched_yield \ + sigaltstack + seccomp.h: seccomp.sh $(PASST_SRCS) $(PASST_HEADERS) @ EXTRA_SYSCALLS="$(EXTRA_SYSCALLS)" ./seccomp.sh $(PASST_SRCS) $(PASST_HEADERS) -- 2.37.0.rc2
On Thu, 17 Nov 2022 12:26:13 +0000 "Richard W.M. Jones" <rjones(a)redhat.com> wrote:This is a hack. Ideally there'd be a way to build a "non-production" build of passt which would turn off all the encapsulation features. They are not relevant for fuzzing and simply add overhead.I'm not sure how quantitatively relevant this is, but I was thinking about cases where sandboxing or "security" features cause issues (not necessarily security-relevant ones) that would be discovered by fuzzing. Partially fitting example: https://archives.passt.top/passt-dev/20221115012400.2240327-1-sbrivio@redha… there, perror() in glibc results in a dup() call, with seccomp terminating the proceess, in a way that was totally unexpected to me. Should fuzzing trigger a case like this one, without a seccomp filter loaded, we won't notice. Now, the guest affecting its own availability isn't security relevant, and that's the worst that can happen, but still it would be something to fix. Considering this, I'm actually more inclined to polish your hack (into an 'afl' Makefile target or similar). -- Stefano
And adjust it so it can be used to fuzz passt. Follow the instructions in README.fuzzing.md Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> --- .gitignore | 2 + fuzzing/Makefile | 15 +++ fuzzing/README.fuzzing.md | 36 +++++++ fuzzing/fuzz-wrapper.c | 173 +++++++++++++++++++++++++++++++++ fuzzing/testcase_dir/empty_tap | Bin 0 -> 4 bytes 5 files changed, 226 insertions(+) diff --git a/.gitignore b/.gitignore index d3d0e2c..2d001da 100644 --- a/.gitignore +++ b/.gitignore @@ -1,4 +1,6 @@ *~ +/fuzzing/fuzz-wrapper +/fuzzing/sync_dir /passt /passt.avx2 /pasta diff --git a/fuzzing/Makefile b/fuzzing/Makefile new file mode 100644 index 0000000..ae5ecd8 --- /dev/null +++ b/fuzzing/Makefile @@ -0,0 +1,15 @@ +# SPDX-License-Identifier: AGPL-3.0-or-later +# Copyright (c) 2021 Red Hat GmbH +# Author: Stefano Brivio <sbrivio(a)redhat.com> +# Author: Richard W.M. Jones <rjones(a)redhat.com> + +all: fuzz-wrapper + +CFLAGS := -g -O2 + +fuzz-wrapper: fuzz-wrapper.c + $(CC) $(FLAGS) $(CFLAGS) $^ -o $@ $(LDFLAGS) + +.PHONY: clean +clean: + rm -f fuzz-wrapper *~ *.o diff --git a/fuzzing/README.fuzzing.md b/fuzzing/README.fuzzing.md new file mode 100644 index 0000000..4d86b2f --- /dev/null +++ b/fuzzing/README.fuzzing.md @@ -0,0 +1,36 @@ +## Fuzzing with AFL++ (https://aflplus.plus/) + +1. In the top directory rebuild passt with AFL instrumentation, Clang + and ASAN: + +``` +make clean +AFL_USE_ASAN=1 make CC=/usr/bin/afl-clang-lto passt +``` + +2. In the fuzzing/ subdirectory, build the fuzzing wrapper *without* + instrumentation: + +``` +make fuzz-wrapper +``` + +3. Run AFL++ + +Create `fuzzing/sync_dir` and run multiple copies of afl-fuzz. +Usually you should run 1 master (-M) and as many slaves (-S) as you +can. + +Master: + +``` +mkdir -p sync_dir +afl-fuzz -i testcase_dir -o sync_dir -M fuzz01 ./fuzz-wrapper @@ +``` + +Slaves: + +``` +# replace fuzzNN with fuzz02, fuzz03, etc. +afl-fuzz -i testcase_dir -o sync_dir -S fuzzNN ./fuzz-wrapper @@ +``` diff --git a/fuzzing/fuzz-wrapper.c b/fuzzing/fuzz-wrapper.c new file mode 100644 index 0000000..9b5f0d8 --- /dev/null +++ b/fuzzing/fuzz-wrapper.c @@ -0,0 +1,173 @@ +/* Fuzzing wrapper + * Derived from libnbd fuzzing wrapper + * Copyright (C) 2013-2022 Red Hat Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include <stdio.h> +#include <stdlib.h> +#include <stdbool.h> +#include <stdint.h> +#include <inttypes.h> +#include <string.h> +#include <fcntl.h> +#include <unistd.h> +#include <poll.h> +#include <errno.h> +#include <sys/types.h> +#include <sys/socket.h> +#include <sys/wait.h> + +#include <libnbd.h> + +static void passt (int s); +static void qemu (int fd, int s); + +int +main (int argc, char *argv[]) +{ + int fd; + pid_t pid; + int sv[2]; + + if (argc == 2) { + /* Open the test case before we fork so we know the file exists. */ + fd = open (argv[1], O_RDONLY); + if (fd == -1) { + fprintf (stderr, "fuzz-wrapper: "); + perror (argv[1]); + exit (EXIT_FAILURE); + } + } + else { + fprintf (stderr, "fuzz-wrapper testcase\n"); + exit (EXIT_FAILURE); + } + + /* Create a connected socket. */ + if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) == -1) { + perror ("fuzz-wrapper: socketpair"); + exit (EXIT_FAILURE); + } + + /* Fork: The parent will be the passt process. The child will be + * the phony qemu. + */ + pid = fork (); + if (pid == -1) { + perror ("fuzz-wrapper: fork"); + exit (EXIT_FAILURE); + } + + if (pid > 0) { + /* Parent: passt. */ + close (sv[1]); + close (fd); + + passt (sv[0]); + } + + /* Child: qemu. */ + close (sv[0]); + + qemu (fd, sv[1]); + + close (sv[1]); + + _exit (EXIT_SUCCESS); +} + +/* This is the parent process running passt. */ +static void +passt (int sock) +{ + char sock_str[32]; + + snprintf (sock_str, sizeof sock_str, "%d", sock); + /* XXX Assumes passt is compiled in the top directory: */ + execlp ("../passt", "passt", "-f", "-e", "-1", "--fd", sock_str, NULL); + perror ("fuzz-wrapper: execlp"); + _exit (EXIT_FAILURE); +} + +/* This is the child process acting like qemu. */ +static void +qemu (int fd, int sock) +{ + struct pollfd pfds[1]; + char rbuf[512], wbuf[512]; + size_t wsize = 0; + ssize_t r; + + for (;;) { + pfds[0].fd = sock; + pfds[0].events = POLLIN; + if (wsize > 0 || fd >= 0) pfds[0].events |= POLLOUT; + pfds[0].revents = 0; + + if (poll (pfds, 1, -1) == -1) { + if (errno == EINTR) + continue; + perror ("fuzz-wrapper: poll"); + /* This is not an error. */ + return; + } + + /* We can read from the passt socket. Just throw away anything sent. */ + if ((pfds[0].revents & POLLIN) != 0) { + r = read (sock, rbuf, sizeof rbuf); + if (r == -1 && errno != EINTR) { + perror ("fuzz-wrapper: read"); + return; + } + else if (r == 0) /* end of input from the server */ + return; + } + + /* We can write to the passt socket. */ + if ((pfds[0].revents & POLLOUT) != 0) { + /* Write more data from the wbuf. */ + if (wsize > 0) { + morewrite: + r = write (sock, wbuf, wsize); + if (r == -1 && errno != EAGAIN && errno != EWOULDBLOCK) { + perror ("fuzz-wrapper: write"); + return; + } + else if (r > 0) { + memmove (wbuf, &wbuf[r], wsize-r); + wsize -= r; + } + } + /* Write more data from the file. */ + else if (fd >= 0) { + r = read (fd, wbuf, sizeof wbuf); + if (r == -1) { + perror ("fuzz-wrapper: read"); + _exit (EXIT_FAILURE); + } + else if (r == 0) { + fd = -1; /* ignore the file from now on */ + shutdown (sock, SHUT_WR); + } + else { + wsize = r; + goto morewrite; + } + } + } + } /* for (;;) */ +} diff --git a/fuzzing/testcase_dir/empty_tap b/fuzzing/testcase_dir/empty_tap new file mode 100644 index 0000000000000000000000000000000000000000..593f4708db84ac8fd0f5cc47c634f38c013fe9e4 GIT binary patch literal 4 LcmZQzU|;|M00aO5 literal 0 HcmV?d00001 -- 2.37.0.rc2
Not an actual review, just two things I noticed while trying this: On Thu, 17 Nov 2022 12:26:14 +0000 "Richard W.M. Jones" <rjones(a)redhat.com> wrote:diff --git a/fuzzing/README.fuzzing.md b/fuzzing/README.fuzzing.md new file mode 100644 index 0000000..4d86b2f --- /dev/null +++ b/fuzzing/README.fuzzing.md @@ -0,0 +1,36 @@ +## Fuzzing with AFL++ (https://aflplus.plus/) + +1. In the top directory rebuild passt with AFL instrumentation, Clang + and ASAN: + +``` +make clean +AFL_USE_ASAN=1 make CC=/usr/bin/afl-clang-lto passtI don't have afl-clang-lto, I used afl-clang-fast instead (there's also afl-clang, not sure which one is the most appropriate).[...] +++ b/fuzzing/fuzz-wrapper.c @@ -0,0 +1,173 @@ +/* Fuzzing wrapper + * Derived from libnbd fuzzing wrapper + * Copyright (C) 2013-2022 Red Hat Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include <stdio.h> +#include <stdlib.h> +#include <stdbool.h> +#include <stdint.h> +#include <inttypes.h> +#include <string.h> +#include <fcntl.h> +#include <unistd.h> +#include <poll.h> +#include <errno.h> +#include <sys/types.h> +#include <sys/socket.h> +#include <sys/wait.h> + +#include <libnbd.h>This can probably be dropped (noticed because I didn't have libnbd-dev installed). -- Stefano
On Thu, 17 Nov 2022 12:26:09 +0000 "Richard W.M. Jones" <rjones(a)redhat.com> wrote:Unfortunately I don't think the --fd option is working. stracing the code shows the socket being added to the epoll, but it somehow never gets read. It might be something obvious but I couldn't see what was wrong. (NB: The socket passed in is *connected* already).I'm looking into this (right now just for something obvious, if nothing pops up, a bit later) but actually I wonder: couldn't qemu() from 5/5 just call socket(AF_UNIX, SOCK_STREAM, 0) and connect() on it? Or am I missing something? Then sure, as you mentioned, this could be useful for something else, so it's probably worth it to get it working. -- Stefano
On Thu, Nov 17, 2022 at 02:58:57PM +0100, Stefano Brivio wrote:On Thu, 17 Nov 2022 12:26:09 +0000 "Richard W.M. Jones" <rjones(a)redhat.com> wrote:We could create a temporary socket in the filesystem. --fd avoids the overhead of having to connect on one side and listen/accept on the other side, and having something in the filesystem which needs to be cleaned up. (Linux has the anonymous namespace which avoids cleanup but not the rest).Unfortunately I don't think the --fd option is working. stracing the code shows the socket being added to the epoll, but it somehow never gets read. It might be something obvious but I couldn't see what was wrong. (NB: The socket passed in is *connected* already).I'm looking into this (right now just for something obvious, if nothing pops up, a bit later) but actually I wonder: couldn't qemu() from 5/5 just call socket(AF_UNIX, SOCK_STREAM, 0) and connect() on it? Or am I missing something?Then sure, as you mentioned, this could be useful for something else, so it's probably worth it to get it working.Right - we have spent a lot of time adding fd passing to qemu, nbdkit & libnbd just because it's generally useful. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org