With this series, fuzzing actually works, albeit slowly. More on that below. Patches 1 & 2 are the same as before. Patch 3 is Stefano Brivio's modified patch (with some changes that we discussed together on IRC but otherwise unchanged). Patch 4 is new, but discussed already upstream: It changes the order in which EPOLLIN and EPOLLRDHUP events are processed, so that we don't drop packets when the socket is closed. Patches 5 & 6 are the hacks that were needed to get fuzzing to work. Patch 6 removes all seccomp and other isolation stuff because for unclear reasons that breaks AFL instrumentation. AFL appears to fork off a second process, and somehow strace cannot follow that process, but the second process fails, and that breaks AFL completely. Without strace data it's rather hard to see what's going on so I didn't investigate this further. Patch 7 adds the fuzzing wrapper and is not greatly changed from before. However I did have to disable the AFL "fork server" optimization which somehow doesn't work with passt (it does work fine with libnbd & nbdkit). Speed is not great. I'm getting ~ 75-80 execs/second. Really we want this to be much higher, since that ultimately governs how fast we can explore new code paths and find bugs. Ideally well over 1000 execs/s (per fuzzing process) would be a good target to aim for. (Of course this depends on the hardware as well.) We could try to find out why the fork server is not compatible with passt, but I don't think we'd gain very much performance there. To explore this I ran a dummy fuzzed process from the same wrapper, and it was hardly any faster. I think the real gains are going to come from reducing the overhead of starting passt. There seem to be some netlink messages sent during start up and maybe if those could be reduced or eliminated we might see better performance. The other factor is fuzzing stability, which hovers around 87-90%. While this isn't necessarily bad, I wonder where the non-determinism is coming from [ideal figures would be 95-100%]. Passt doesn't appear to use threads. It does call getrandom (for TCP sequence numbers) so it'd be good to have a way to bypass that. However I don't fully understand what's going on here. Rich.
If you run the build several times it will fail unnecessarily with: ln -s passt pasta ln: failed to create symbolic link 'pasta': File exists make: *** [Makefile:134: pasta] Error 1 Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 1dc2df5..f8ecaea 100644 --- a/Makefile +++ b/Makefile @@ -133,7 +133,7 @@ passt.avx2: $(PASST_SRCS) $(HEADERS) passt.avx2: passt pasta.avx2 pasta.1 pasta: pasta%: passt% - ln -s $< $@ + ln -sf $< $@ qrap: $(QRAP_SRCS) passt.h $(CC) $(FLAGS) $(CFLAGS) $(CPPFLAGS) $(QRAP_SRCS) -o qrap $(LDFLAGS) -- 2.37.0.rc2
On Thu, Nov 17, 2022 at 06:49:32PM +0000, Richard W.M. Jones wrote:If you run the build several times it will fail unnecessarily with: ln -s passt pasta ln: failed to create symbolic link 'pasta': File exists make: *** [Makefile:134: pasta] Error 1 Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com>Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au> I'm not sure why I haven't hit that before.--- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 1dc2df5..f8ecaea 100644 --- a/Makefile +++ b/Makefile @@ -133,7 +133,7 @@ passt.avx2: $(PASST_SRCS) $(HEADERS) passt.avx2: passt pasta.avx2 pasta.1 pasta: pasta%: passt% - ln -s $< $@ + ln -sf $< $@ qrap: $(QRAP_SRCS) passt.h $(CC) $(FLAGS) $(CFLAGS) $(CPPFLAGS) $(QRAP_SRCS) -o qrap $(LDFLAGS)-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
On Fri, 18 Nov 2022 12:30:25 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:On Thu, Nov 17, 2022 at 06:49:32PM +0000, Richard W.M. Jones wrote:I'm also a bit puzzled: after all, if 'pasta' exists, the 'pasta' target is up to date. Then I hit this while doing stuff on Debian packages, I don't remember what the exact sequence was. Well, in any case, it looks reasonable. -- StefanoIf you run the build several times it will fail unnecessarily with: ln -s passt pasta ln: failed to create symbolic link 'pasta': File exists make: *** [Makefile:134: pasta] Error 1 Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com>Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au> I'm not sure why I haven't hit that before.
These files are left around by emacs amongst other editors. Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index f8ecaea..ced7af2 100644 --- a/Makefile +++ b/Makefile @@ -146,7 +146,7 @@ valgrind: all .PHONY: clean clean: - $(RM) $(BIN) *.o seccomp.h pasta.1 \ + $(RM) $(BIN) *~ *.o seccomp.h pasta.1 \ passt.tar passt.tar.gz *.deb *.rpm \ passt.pid README.plain.md -- 2.37.0.rc2
On Thu, Nov 17, 2022 at 06:49:33PM +0000, Richard W.M. Jones wrote:These files are left around by emacs amongst other editors. Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com>Heh, I've been meaning to add that, but didn't get to it yet. Thanks! Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>--- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index f8ecaea..ced7af2 100644 --- a/Makefile +++ b/Makefile @@ -146,7 +146,7 @@ valgrind: all .PHONY: clean clean: - $(RM) $(BIN) *.o seccomp.h pasta.1 \ + $(RM) $(BIN) *~ *.o seccomp.h pasta.1 \ passt.tar passt.tar.gz *.deb *.rpm \ passt.pid README.plain.md-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
This passes a fully connected stream socket to passt. Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> [sbrivio: reuse fd_tap instead of adding a new descriptor, imply --one-off on --fd, add to optstring and usage()] Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- conf.c | 28 ++++++++++++++++++++++++++-- passt.1 | 10 ++++++++++ passt.c | 1 - passt.h | 2 +- tap.c | 9 +++++++++ 5 files changed, 46 insertions(+), 4 deletions(-) diff --git a/conf.c b/conf.c index a995eb7..9ec346f 100644 --- a/conf.c +++ b/conf.c @@ -719,6 +719,7 @@ static void usage(const char *name) UNIX_SOCK_PATH, 1); } + info( " -F, --fd FD Use FD as pre-opened connected socket"); info( " -p, --pcap FILE Log tap-facing traffic to pcap file"); info( " -P, --pid FILE Write own PID to the given file"); info( " -m, --mtu MTU Assign MTU via DHCP/NDP"); @@ -1079,6 +1080,7 @@ void conf(struct ctx *c, int argc, char **argv) {"log-file", required_argument, NULL, 'l' }, {"help", no_argument, NULL, 'h' }, {"socket", required_argument, NULL, 's' }, + {"fd", required_argument, NULL, 'F' }, {"ns-ifname", required_argument, NULL, 'I' }, {"pcap", required_argument, NULL, 'p' }, {"pid", required_argument, NULL, 'P' }, @@ -1138,9 +1140,9 @@ void conf(struct ctx *c, int argc, char **argv) if (c->mode == MODE_PASTA) { c->no_dhcp_dns = c->no_dhcp_dns_search = 1; - optstring = "dqfel:hI:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:"; + optstring = "dqfel:hF:I:p:P:m:a:n:M:g:i:D:S:46t:u:T:U:"; } else { - optstring = "dqfel:hs:p:P:m:a:n:M:g:i:D:S:461t:u:"; + optstring = "dqfel:hs:F:p:P:m:a:n:M:g:i:D:S:461t:u:"; } c->tcp.fwd_in.mode = c->tcp.fwd_out.mode = 0; @@ -1355,6 +1357,23 @@ void conf(struct ctx *c, int argc, char **argv) err("Invalid socket path: %s", optarg); usage(argv[0]); } + break; + case 'F': + if (c->fd_tap >= 0) { + err("Multiple --fd options given"); + usage(argv[0]); + } + + errno = 0; + c->fd_tap = strtol(optarg, NULL, 0); + + if (c->fd_tap < 0 || errno) { + err("Invalid --fd: %s", optarg); + usage(argv[0]); + } + + c->one_off = true; + break; case 'I': if (*c->pasta_ifn) { @@ -1590,6 +1609,11 @@ void conf(struct ctx *c, int argc, char **argv) usage(argv[0]); } + if (*c->sock_path && c->fd_tap >= 0) { + err("Options --socket and --fd are mutually exclusive"); + usage(argv[0]); + } + ret = conf_ugid(runas, &uid, &gid); if (ret) usage(argv[0]); diff --git a/passt.1 b/passt.1 index e34a3e0..528763b 100644 --- a/passt.1 +++ b/passt.1 @@ -297,6 +297,16 @@ Path for UNIX domain socket used by \fBqemu\fR(1) or \fBqrap\fR(1) to connect to Default is to probe a free socket, not accepting connections, starting from \fI/tmp/passt_1.socket\fR to \fI/tmp/passt_64.socket\fR. +.TP +.BR \-F ", " \-\-fd " " \fIFD +Pass a pre-opened, connected socket to \fBpasst\fR. Usually the socket is opened +in the parent process and \fBpasst\fR inherits it when run as a child. This +allows the parent process to open sockets using another address family or +requiring special privileges. + +This option implies the behaviour described for \-\-one-off, once this socket +is closed. + .TP .BR \-1 ", " \-\-one-off Quit after handling a single client connection, that is, once the client closes diff --git a/passt.c b/passt.c index 7d323c2..8b2c50d 100644 --- a/passt.c +++ b/passt.c @@ -255,7 +255,6 @@ int main(int argc, char **argv) quit_fd = pasta_netns_quit_init(&c); - c.fd_tap = c.fd_tap_listen = -1; tap_sock_init(&c); clock_gettime(CLOCK_MONOTONIC, &now); diff --git a/passt.h b/passt.h index 6649c0a..ca25b90 100644 --- a/passt.h +++ b/passt.h @@ -159,7 +159,7 @@ struct ip6_ctx { * @proc_net_udp: Stored handles for /proc/net/udp{,6} in init and ns * @epollfd: File descriptor for epoll instance * @fd_tap_listen: File descriptor for listening AF_UNIX socket, if any - * @fd_tap: File descriptor for AF_UNIX socket or tuntap device + * @fd_tap: AF_UNIX socket, tuntap device, or pre-opened socket * @mac: Host MAC address * @mac_guest: MAC address of guest or namespace, seen or configured * @ifi4: Index of routable interface for IPv4, 0 if IPv4 disabled diff --git a/tap.c b/tap.c index d26af58..9998127 100644 --- a/tap.c +++ b/tap.c @@ -1069,6 +1069,15 @@ void tap_sock_init(struct ctx *c) } if (c->fd_tap != -1) { + if (c->one_off) { /* Passed as --fd */ + struct epoll_event ev = { 0 }; + + ev.data.fd = c->fd_tap; + ev.events = EPOLLIN | EPOLLET | EPOLLRDHUP; + epoll_ctl(c->epollfd, EPOLL_CTL_ADD, c->fd_tap, &ev); + return; + } + epoll_ctl(c->epollfd, EPOLL_CTL_DEL, c->fd_tap, NULL); close(c->fd_tap); c->fd_tap = -1; -- 2.37.0.rc2
In the case where the client writes a packet and then closes the socket, because we receive EPOLLIN|EPOLLRDHUP together we have a choice of whether to close the socket immediately, or read the packet and then close the socket. Choose the latter. This should improve fuzzing coverage and arguably is a better choice even for regular use since dropping packets on close is bad. See-also: https://archives.passt.top/passt-dev/20221117171805.3746f53a@elisabeth/ Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> --- tap.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tap.c b/tap.c index 9998127..d97af6a 100644 --- a/tap.c +++ b/tap.c @@ -1106,13 +1106,13 @@ void tap_handler(struct ctx *c, int fd, uint32_t events, return; } - if (events & (EPOLLRDHUP | EPOLLHUP | EPOLLERR)) - goto reinit; - if ((c->mode == MODE_PASST && tap_handler_passt(c, now)) || (c->mode == MODE_PASTA && tap_handler_pasta(c, now))) goto reinit; + if (events & (EPOLLRDHUP | EPOLLHUP | EPOLLERR)) + goto reinit; + return; reinit: if (c->one_off) { -- 2.37.0.rc2
On Thu, Nov 17, 2022 at 06:49:35PM +0000, Richard W.M. Jones wrote:In the case where the client writes a packet and then closes the socket, because we receive EPOLLIN|EPOLLRDHUP together we have a choice of whether to close the socket immediately, or read the packet and then close the socket. Choose the latter. This should improve fuzzing coverage and arguably is a better choice even for regular use since dropping packets on close is bad. See-also: https://archives.passt.top/passt-dev/20221117171805.3746f53a@elisabeth/ Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com>Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>--- tap.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tap.c b/tap.c index 9998127..d97af6a 100644 --- a/tap.c +++ b/tap.c @@ -1106,13 +1106,13 @@ void tap_handler(struct ctx *c, int fd, uint32_t events, return; } - if (events & (EPOLLRDHUP | EPOLLHUP | EPOLLERR)) - goto reinit; - if ((c->mode == MODE_PASST && tap_handler_passt(c, now)) || (c->mode == MODE_PASTA && tap_handler_pasta(c, now))) goto reinit; + if (events & (EPOLLRDHUP | EPOLLHUP | EPOLLERR)) + goto reinit; + return; reinit: if (c->one_off) {-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
This is a hack. Ideally there'd be a way to build a "non-production" build of passt which would turn off all the encapsulation features. They are not relevant for fuzzing and simply add overhead. --- Makefile | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/Makefile b/Makefile index ced7af2..ee496fe 100644 --- a/Makefile +++ b/Makefile @@ -119,6 +119,19 @@ all: $(BIN) $(MANPAGES) docs static: FLAGS += -static -DGLIBC_NO_STATIC_NSS static: clean all +# XXX Hack for AFL instrumentation +EXTRA_SYSCALLS += \ + clone \ + getpid \ + gettid \ + madvise \ + mmap \ + mprotect \ + prctl \ + rt_sigprocmask \ + sched_yield \ + sigaltstack + seccomp.h: seccomp.sh $(PASST_SRCS) $(PASST_HEADERS) @ EXTRA_SYSCALLS="$(EXTRA_SYSCALLS)" ./seccomp.sh $(PASST_SRCS) $(PASST_HEADERS) -- 2.37.0.rc2
I suspect these interfere in some way with AFL instrumentation. The instrumented binary deadlocks without this patch, but it's hard to understand why just looking at strace output. --- conf.c | 2 +- passt.c | 6 ++++-- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/conf.c b/conf.c index 9ec346f..fcc5664 100644 --- a/conf.c +++ b/conf.c @@ -1655,7 +1655,7 @@ void conf(struct ctx *c, int argc, char **argv) usage(argv[0]); } - isolate_user(uid, gid, !netns_only, userns, c->mode); + //isolate_user(uid, gid, !netns_only, userns, c->mode); if (c->pasta_conf_ns) c->no_ra = 1; diff --git a/passt.c b/passt.c index 8b2c50d..2a4e65a 100644 --- a/passt.c +++ b/passt.c @@ -185,7 +185,7 @@ int main(int argc, char **argv) arch_avx2_exec(argv); - isolate_initial(); + //isolate_initial(); c.pasta_netns_fd = c.fd_tap = c.fd_tap_listen = -1; @@ -291,17 +291,19 @@ int main(int argc, char **argv) } } +#if 0 if (isolate_prefork(&c)) { err("Failed to sandbox process, exiting\n"); exit(EXIT_FAILURE); } +#endif if (!c.foreground) __daemon(pidfile_fd, devnull_fd); else write_pidfile(pidfile_fd, getpid()); - isolate_postfork(&c); + //isolate_postfork(&c); timer_init(&c, &now); -- 2.37.0.rc2
And adjust it so it can be used to fuzz passt. Follow the instructions in README.fuzzing.md Signed-off-by: Richard W.M. Jones <rjones(a)redhat.com> --- .gitignore | 2 + fuzzing/Makefile | 15 +++ fuzzing/README.fuzzing.md | 43 +++++++++ fuzzing/fuzz-wrapper.c | 171 +++++++++++++++++++++++++++++++++ fuzzing/testcase_dir/empty_tap | Bin 0 -> 4 bytes 5 files changed, 231 insertions(+) diff --git a/.gitignore b/.gitignore index d3d0e2c..2d001da 100644 --- a/.gitignore +++ b/.gitignore @@ -1,4 +1,6 @@ *~ +/fuzzing/fuzz-wrapper +/fuzzing/sync_dir /passt /passt.avx2 /pasta diff --git a/fuzzing/Makefile b/fuzzing/Makefile new file mode 100644 index 0000000..ae5ecd8 --- /dev/null +++ b/fuzzing/Makefile @@ -0,0 +1,15 @@ +# SPDX-License-Identifier: AGPL-3.0-or-later +# Copyright (c) 2021 Red Hat GmbH +# Author: Stefano Brivio <sbrivio(a)redhat.com> +# Author: Richard W.M. Jones <rjones(a)redhat.com> + +all: fuzz-wrapper + +CFLAGS := -g -O2 + +fuzz-wrapper: fuzz-wrapper.c + $(CC) $(FLAGS) $(CFLAGS) $^ -o $@ $(LDFLAGS) + +.PHONY: clean +clean: + rm -f fuzz-wrapper *~ *.o diff --git a/fuzzing/README.fuzzing.md b/fuzzing/README.fuzzing.md new file mode 100644 index 0000000..8a8a7f3 --- /dev/null +++ b/fuzzing/README.fuzzing.md @@ -0,0 +1,43 @@ +## Fuzzing with AFL++ (https://aflplus.plus/) + +1. In the top directory rebuild passt with AFL instrumentation, Clang + and ASAN: + +``` +make clean +AFL_USE_ASAN=1 make CC=/usr/bin/afl-clang-fast passt +``` + +2. In the fuzzing/ subdirectory, build the fuzzing wrapper *without* + instrumentation: + +``` +cd fuzzing +make fuzz-wrapper +``` + +3. Run AFL++ + +Create `fuzzing/sync_dir` and run multiple copies of afl-fuzz. +Usually you should run 1 master (-M) and as many slaves (-S) as you +can. + +Master: + +``` +cd fuzzing +mkdir -p sync_dir +export AFL_SKIP_BIN_CHECK=1 +export AFL_NO_FORKSRV=1 +afl-fuzz -i testcase_dir -o sync_dir -M fuzz01 ./fuzz-wrapper @@ +``` + +Slaves: + +``` +cd fuzzing +export AFL_SKIP_BIN_CHECK=1 +export AFL_NO_FORKSRV=1 +# replace fuzzNN with fuzz02, fuzz03, etc. +afl-fuzz -i testcase_dir -o sync_dir -S fuzzNN ./fuzz-wrapper @@ +``` diff --git a/fuzzing/fuzz-wrapper.c b/fuzzing/fuzz-wrapper.c new file mode 100644 index 0000000..9e2bb43 --- /dev/null +++ b/fuzzing/fuzz-wrapper.c @@ -0,0 +1,171 @@ +/* Fuzzing wrapper + * Derived from libnbd fuzzing wrapper + * Copyright (C) 2013-2022 Red Hat Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include <stdio.h> +#include <stdlib.h> +#include <stdbool.h> +#include <stdint.h> +#include <inttypes.h> +#include <string.h> +#include <fcntl.h> +#include <unistd.h> +#include <poll.h> +#include <errno.h> +#include <sys/types.h> +#include <sys/socket.h> +#include <sys/wait.h> + +static void passt (int s); +static void qemu (int fd, int s); + +int +main (int argc, char *argv[]) +{ + int fd; + pid_t pid; + int sv[2]; + + if (argc == 2) { + /* Open the test case before we fork so we know the file exists. */ + fd = open (argv[1], O_RDONLY); + if (fd == -1) { + fprintf (stderr, "fuzz-wrapper: "); + perror (argv[1]); + exit (EXIT_FAILURE); + } + } + else { + fprintf (stderr, "fuzz-wrapper testcase\n"); + exit (EXIT_FAILURE); + } + + /* Create a connected socket. */ + if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) == -1) { + perror ("fuzz-wrapper: socketpair"); + exit (EXIT_FAILURE); + } + + /* Fork: The parent will be the passt process. The child will be + * the phony qemu. + */ + pid = fork (); + if (pid == -1) { + perror ("fuzz-wrapper: fork"); + exit (EXIT_FAILURE); + } + + if (pid > 0) { + /* Parent: passt. */ + close (sv[1]); + close (fd); + + passt (sv[0]); + } + + /* Child: qemu. */ + close (sv[0]); + + qemu (fd, sv[1]); + + close (sv[1]); + + _exit (EXIT_SUCCESS); +} + +/* This is the parent process running passt. */ +static void +passt (int sock) +{ + char sock_str[32]; + + snprintf (sock_str, sizeof sock_str, "%d", sock); + /* XXX Assumes passt is compiled in the top directory: */ + execlp ("../passt", "passt", "-f", "-e", "-1", "--fd", sock_str, NULL); + perror ("fuzz-wrapper: execlp"); + _exit (EXIT_FAILURE); +} + +/* This is the child process acting like qemu. */ +static void +qemu (int fd, int sock) +{ + struct pollfd pfds[1]; + char rbuf[512], wbuf[512]; + size_t wsize = 0; + ssize_t r; + + for (;;) { + pfds[0].fd = sock; + pfds[0].events = POLLIN; + if (wsize > 0 || fd >= 0) pfds[0].events |= POLLOUT; + pfds[0].revents = 0; + + if (poll (pfds, 1, -1) == -1) { + if (errno == EINTR) + continue; + perror ("fuzz-wrapper: poll [ignored]"); + /* This is not an error. */ + return; + } + + /* We can read from the passt socket. Just throw away anything sent. */ + if ((pfds[0].revents & POLLIN) != 0) { + r = read (sock, rbuf, sizeof rbuf); + if (r == -1 && errno != EINTR) { + perror ("fuzz-wrapper: read [ignored]"); + return; + } + else if (r == 0) /* end of input from the server */ + return; + } + + /* We can write to the passt socket. */ + if ((pfds[0].revents & POLLOUT) != 0) { + /* Write more data from the wbuf. */ + if (wsize > 0) { + morewrite: + r = write (sock, wbuf, wsize); + if (r == -1 && errno != EAGAIN && errno != EWOULDBLOCK) { + perror ("fuzz-wrapper: write [ignored]"); + return; + } + else if (r > 0) { + memmove (wbuf, &wbuf[r], wsize-r); + wsize -= r; + } + } + /* Write more data from the file. */ + else if (fd >= 0) { + r = read (fd, wbuf, sizeof wbuf); + if (r == -1) { + perror ("fuzz-wrapper: read"); + _exit (EXIT_FAILURE); + } + else if (r == 0) { + fd = -1; /* ignore the file from now on */ + shutdown (sock, SHUT_WR); + } + else { + wsize = r; + goto morewrite; + } + } + } + } /* for (;;) */ +} diff --git a/fuzzing/testcase_dir/empty_tap b/fuzzing/testcase_dir/empty_tap new file mode 100644 index 0000000000000000000000000000000000000000..593f4708db84ac8fd0f5cc47c634f38c013fe9e4 GIT binary patch literal 4 LcmZQzU|;|M00aO5 literal 0 HcmV?d00001 -- 2.37.0.rc2
I forgot to mention that I didn't sort out test cases yet. The current test case (single zero length packet) is not ideal. We might have a single test case, say a single TCP SYN packet. However if there are distinct areas of functionality in passt (eg. TCP connect, ARP, DNS, DHCP, ...), *and* if those are geometrically very far apart in the search space, then you could argue for having one test case per major feature. Capturing that into tap files will be a fun afternoon for someone. More on this topic here: https://aflplus.plus/docs/fuzzing_in_depth/#2-preparing-the-fuzzing-campaign - - - Also ... while we do have --fd functionality, even better would be to modify passt so it can read input from stdin and drop output. See: https://aflplus.plus/docs/best_practices/#fuzzing-a-network-service The socketpair / --fd approach is a bit cleaner. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top
I was wondering if --fd 0 would work, but it doesn't. This hangs: $ ../passt -1 -f -e --fd 0 < testcase_dir/empty_tap It's clear from strace why this is. Trying to add fd 0 to epoll returns -EPERM. The epoll_ctl(2) manual says: EPERM The target file fd does not support epoll. This error can occur if fd refers to, for example, a regular file or a directory. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top
On Thu, 17 Nov 2022 18:49:31 +0000 "Richard W.M. Jones" <rjones(a)redhat.com> wrote:With this series, fuzzing actually works, albeit slowly. More on that below. Patches 1 & 2 are the same as before. Patch 3 is Stefano Brivio's modified patch (with some changes that we discussed together on IRC but otherwise unchanged). Patch 4 is new, but discussed already upstream: It changes the order in which EPOLLIN and EPOLLRDHUP events are processed, so that we don't drop packets when the socket is closed. Patches 5 & 6 are the hacks that were needed to get fuzzing to work. Patch 6 removes all seccomp and other isolation stuff because for unclear reasons that breaks AFL instrumentation. AFL appears to fork off a second process, and somehow strace cannot follow that process, but the second process fails, and that breaks AFL completely. Without strace data it's rather hard to see what's going on so I didn't investigate this further. Patch 7 adds the fuzzing wrapper and is not greatly changed from before. However I did have to disable the AFL "fork server" optimization which somehow doesn't work with passt (it does work fine with libnbd & nbdkit). Speed is not great. I'm getting ~ 75-80 execs/second. Really we want this to be much higher, since that ultimately governs how fast we can explore new code paths and find bugs. Ideally well over 1000 execs/s (per fuzzing process) would be a good target to aim for. (Of course this depends on the hardware as well.)Applied up to 4/7, thanks! For the rest: I have a local branch with 5/7 and 6/7 fixed up: the 'fuzzing' Makefile target enables the syscalls you listed and avoids prctl(PR_SET_DUMPABLE, 0) in isolation_postfork() via -DFUZZING. I managed to speed things up by skipping some operations not needed for fuzzing, but just a tiny bit (~200 execs/s). I'm looking into switching to persistent mode: https://github.com/AFLplusplus/AFLplusplus/blob/stable/instrumentation/READ… and introducing frames with special values, as you hinted on IRC, for example one-byte frames with commands such as "go ahead with socket processing then come back to 'tap' frames", so that passt has a chance to do some meaningful socket-side operations before getting back to fuzz input. Patch 7/7 is very useful and appreciated anyway as it demystifies the whole topic for me, and we can probably recycle most of the documentation. I'm not sure yet how/if the wrapper still fits with the stuff I'm looking into. -- Stefano
On Fri, Nov 25, 2022 at 10:23:54AM +0100, Stefano Brivio wrote:and introducing frames with special values, as you hinted on IRC, for example one-byte frames with commands such as "go ahead with socket processing then come back to 'tap' frames", so that passt has a chance to do some meaningful socket-side operations before getting back to fuzz input.You can improve the chance that the fuzzer will find these frames by either including them in test cases (we need better test cases, which is separate issue), or by making the encoding such that they are easy to find. eg. if you had four possible values, encode them only in the bottom two bits and ignore the higher bits. Since these frames are only used for fuzzing you can change the meaning of them later, so exact encoding isn't an ABI issue.Patch 7/7 is very useful and appreciated anyway as it demystifies the whole topic for me, and we can probably recycle most of the documentation. I'm not sure yet how/if the wrapper still fits with the stuff I'm looking into.It would definitely be better to have passt itself be able to read a file off disk. For example when we fuzz nbdkit we do not used or need a wrapper, because nbdkit has an -s / --single option that reads from stdin and writes to stdout. This was originally added to inetd support. We drive nbdkit from the fuzzer directly like this: afl-fuzz -i fuzzing/testcase_dir -o fuzzing/sync_dir -M fuzz01 \ ./server/nbdkit -s -t 1 ./plugins/memory/.libs/nbdkit-memory-plugin.so 1M (https://gitlab.com/nbdkit/nbdkit/-/blob/ef035f7090d8bec2700ef1f941e371d351d…) Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW
On Fri, 25 Nov 2022 10:11:03 +0000 "Richard W.M. Jones" <rjones(a)redhat.com> wrote:On Fri, Nov 25, 2022 at 10:23:54AM +0100, Stefano Brivio wrote:Right, yes, I was thinking we could, under #ifdef FUZZING, accept frames that are shorter than a 802.3 header (up to 13 bytes), and take their length (in network order, but I guess AFL++ can easily get familiar with it) as fuzzing flow commands. About test cases, I'm not sure this should be included in regular test runs, because there's no reasonable definition for a test duration. I'd rather have a separate script which keeps running indefinitely, updating sources as they become available.and introducing frames with special values, as you hinted on IRC, for example one-byte frames with commands such as "go ahead with socket processing then come back to 'tap' frames", so that passt has a chance to do some meaningful socket-side operations before getting back to fuzz input.You can improve the chance that the fuzzer will find these frames by either including them in test cases (we need better test cases, which is separate issue), or by making the encoding such that they are easy to find. eg. if you had four possible values, encode them only in the bottom two bits and ignore the higher bits. Since these frames are only used for fuzzing you can change the meaning of them later, so exact encoding isn't an ABI issue.Ah, I didn't know, interesting. On the other hand, would it really make sense to add support for that (which has probably no use other than fuzzing) instead of a hack with __AFL_FUZZ_TESTCASE_BUF and mmap()? According to AFL++ documentation that should speed things up considerably: https://github.com/AFLplusplus/AFLplusplus/blob/stable/instrumentation/READ… -- StefanoPatch 7/7 is very useful and appreciated anyway as it demystifies the whole topic for me, and we can probably recycle most of the documentation. I'm not sure yet how/if the wrapper still fits with the stuff I'm looking into.It would definitely be better to have passt itself be able to read a file off disk. For example when we fuzz nbdkit we do not used or need a wrapper, because nbdkit has an -s / --single option that reads from stdin and writes to stdout. This was originally added to inetd support. We drive nbdkit from the fuzzer directly like this: afl-fuzz -i fuzzing/testcase_dir -o fuzzing/sync_dir -M fuzz01 \ ./server/nbdkit -s -t 1 ./plugins/memory/.libs/nbdkit-memory-plugin.so 1M (https://gitlab.com/nbdkit/nbdkit/-/blob/ef035f7090d8bec2700ef1f941e371d351d…)
On Tue, Nov 29, 2022 at 02:34:42PM +0100, Stefano Brivio wrote:On Fri, 25 Nov 2022 10:11:03 +0000 "Richard W.M. Jones" <rjones(a)redhat.com> wrote:Yes for libnbd/nbdkit/hivex we don't (can't) fuzz as part of regular tests. It's a separate process. You might be able to apply to this programme: https://google.github.io/oss-fuzz/ (You'll still need to add the fuzzing support upstream.)On Fri, Nov 25, 2022 at 10:23:54AM +0100, Stefano Brivio wrote:Right, yes, I was thinking we could, under #ifdef FUZZING, accept frames that are shorter than a 802.3 header (up to 13 bytes), and take their length (in network order, but I guess AFL++ can easily get familiar with it) as fuzzing flow commands. About test cases, I'm not sure this should be included in regular test runs, because there's no reasonable definition for a test duration. I'd rather have a separate script which keeps running indefinitely, updating sources as they become available.and introducing frames with special values, as you hinted on IRC, for example one-byte frames with commands such as "go ahead with socket processing then come back to 'tap' frames", so that passt has a chance to do some meaningful socket-side operations before getting back to fuzz input.You can improve the chance that the fuzzer will find these frames by either including them in test cases (we need better test cases, which is separate issue), or by making the encoding such that they are easy to find. eg. if you had four possible values, encode them only in the bottom two bits and ignore the higher bits. Since these frames are only used for fuzzing you can change the meaning of them later, so exact encoding isn't an ABI issue.Yes if you can get __AFL_FUZZ_TESTCASE_BUF working that would be even better. For nbdkit it seemed like it would be quite difficult because we would need to ensure that the binary can be completely "reset" between runs without being re-exec'd, so we'd need to chase down every possible allocation / initialization and make sure it's reversed. We get pretty good fuzzing performance by the method above so it was never a priority. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-topAh, I didn't know, interesting. On the other hand, would it really make sense to add support for that (which has probably no use other than fuzzing) instead of a hack with __AFL_FUZZ_TESTCASE_BUF and mmap()? According to AFL++ documentation that should speed things up considerably: https://github.com/AFLplusplus/AFLplusplus/blob/stable/instrumentation/READ…Patch 7/7 is very useful and appreciated anyway as it demystifies the whole topic for me, and we can probably recycle most of the documentation. I'm not sure yet how/if the wrapper still fits with the stuff I'm looking into.It would definitely be better to have passt itself be able to read a file off disk. For example when we fuzz nbdkit we do not used or need a wrapper, because nbdkit has an -s / --single option that reads from stdin and writes to stdout. This was originally added to inetd support. We drive nbdkit from the fuzzer directly like this: afl-fuzz -i fuzzing/testcase_dir -o fuzzing/sync_dir -M fuzz01 \ ./server/nbdkit -s -t 1 ./plugins/memory/.libs/nbdkit-memory-plugin.so 1M (https://gitlab.com/nbdkit/nbdkit/-/blob/ef035f7090d8bec2700ef1f941e371d351d…)
On Tue, Nov 29, 2022 at 02:34:42PM +0100, Stefano Brivio wrote:On Fri, 25 Nov 2022 10:11:03 +0000 "Richard W.M. Jones" <rjones(a)redhat.com> wrote:I agree we don't want to do fuzzing per se as part of the regular test. However, it would be nice to replay specific fuzz-generated scenarios where we've previously found bugs, to avoid regression. I'm not really sure what would be involved in that.On Fri, Nov 25, 2022 at 10:23:54AM +0100, Stefano Brivio wrote:Right, yes, I was thinking we could, under #ifdef FUZZING, accept frames that are shorter than a 802.3 header (up to 13 bytes), and take their length (in network order, but I guess AFL++ can easily get familiar with it) as fuzzing flow commands. About test cases, I'm not sure this should be included in regular test runs, because there's no reasonable definition for a test duration. I'd rather have a separate script which keeps running indefinitely, updating sources as they become available.and introducing frames with special values, as you hinted on IRC, for example one-byte frames with commands such as "go ahead with socket processing then come back to 'tap' frames", so that passt has a chance to do some meaningful socket-side operations before getting back to fuzz input.You can improve the chance that the fuzzer will find these frames by either including them in test cases (we need better test cases, which is separate issue), or by making the encoding such that they are easy to find. eg. if you had four possible values, encode them only in the bottom two bits and ignore the higher bits. Since these frames are only used for fuzzing you can change the meaning of them later, so exact encoding isn't an ABI issue.-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibsonAh, I didn't know, interesting. On the other hand, would it really make sense to add support for that (which has probably no use other than fuzzing) instead of a hack with __AFL_FUZZ_TESTCASE_BUF and mmap()? According to AFL++ documentation that should speed things up considerably: https://github.com/AFLplusplus/AFLplusplus/blob/stable/instrumentation/READ…Patch 7/7 is very useful and appreciated anyway as it demystifies the whole topic for me, and we can probably recycle most of the documentation. I'm not sure yet how/if the wrapper still fits with the stuff I'm looking into.It would definitely be better to have passt itself be able to read a file off disk. For example when we fuzz nbdkit we do not used or need a wrapper, because nbdkit has an -s / --single option that reads from stdin and writes to stdout. This was originally added to inetd support. We drive nbdkit from the fuzzer directly like this: afl-fuzz -i fuzzing/testcase_dir -o fuzzing/sync_dir -M fuzz01 \ ./server/nbdkit -s -t 1 ./plugins/memory/.libs/nbdkit-memory-plugin.so 1M (https://gitlab.com/nbdkit/nbdkit/-/blob/ef035f7090d8bec2700ef1f941e371d351d…)