Hi,
I just tested the master code, the passt just exited without any error.

```
$ ./passt
Outbound interface: ens192
ARP:
address: 00:50:56:be:9d:1f
DHCP:
assign: 192.168.64.217
mask: 255.255.240.0
router: 192.168.64.1
search:
.
UNIX domain socket bound at /tmp/passt_1.socket

You can now start qrap:
./qrap 5 kvm ... -net socket,fd=5 -net nic,model=virtio
or directly qemu, patched with:
qemu/0001-net-Allow-also-UNIX-domain-sockets-to-be-used-as-net.patch
as follows:
kvm ... -net socket,connect=/tmp/passt_1.socket -net nic,model=virtio
21-10-26 13:23:43 root@192.168.64.217:~/passt(master✗)
```

Another terminal:
```
$ bash x.sh
recv: Connection reset by peer
Probe of /tmp/passt_1.socket failed
connect: No such file or directory
Probe of /tmp/passt_2.socket failed
connect: No such file or directory
Probe of /tmp/passt_3.socket failed
...
```

```
$ cat x.sh
./qrap 5 /root/qemu-master/build/qemu-system-x86_64 -uuid 1869b108-42b3-42a7-852e-70261d73f6a9 -name guest=1869b108-42b3-42a7-852e-70261d73f6a9 -cpu host -enable-kvm -smp 4 -device pci-bridge,chassis_nr=1,id=pci.0 -device pci-bridge,chassis_nr=1,id=pci.1 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x10 -fsdev local,security_model=mapped,id=fsdev-fs0,path=/root -device virtio-9p-pci,id=fs0,fsdev=fsdev-fs0,mount_tag=fs0 -device intel-iommu,device-iotlb=on,intremap=on -M q35,accel=kvm,kernel-irqchip=split -serial mon:stdio -nographic -m size=2048M,maxmem=32G,slots=128 -device virtio-balloon -drive file=/root/fedora34.qcow2,format=qcow2,cache=none,aio=native,if=none,id=drive-virtio-disk1,file.locking=off -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk1,id=virtio-disk1,bootindex=1 -vnc 0.0.0.0:101 -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-222-29ec19c9-d330-4949-b/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -qmp tcp:0.0.0.0:2235,server,nowait -device virtio-serial-pci,id=virtio-serial0,max_ports=16 -net socket,fd=5 -net nic,model=virtio
```

Thanks,
Feng Li
Add cc.

On Tue, Oct 26, 2021 at 1:28 PM Li Feng <fengli(a)smartx.com> wrote:
> [...]
Hi Feng Li,

On Thu, 28 Oct 2021 12:25:29 +0800
Li Feng <fengli(a)smartx.com> wrote:

> Add cc.

Sorry, I missed your email. It looks like on Mailman3, if I'm the owner
of a list, I'm not automatically a member receiving posts from the list
itself -- added myself as member, too.

> On Tue, Oct 26, 2021 at 1:28 PM Li Feng <fengli(a)smartx.com> wrote:
> >
> > Hi,
> > I just tested the master code, the passt just exited without any error.
> >
> > ```
> > $ ./passt
> > Outbound interface: ens192
> > ARP:
> > address: 00:50:56:be:9d:1f
> > DHCP:
> > assign: 192.168.64.217
> > mask: 255.255.240.0
> > router: 192.168.64.1
> > search:
> > .
> > UNIX domain socket bound at /tmp/passt_1.socket
> >
> > You can now start qrap:
> > ./qrap 5 kvm ... -net socket,fd=5 -net nic,model=virtio
> > or directly qemu, patched with:
> > qemu/0001-net-Allow-also-UNIX-domain-sockets-to-be-used-as-net.patch
> > as follows:
> > kvm ... -net socket,connect=/tmp/passt_1.socket -net nic,model=virtio
> > 21-10-26 13:23:43 root@192.168.64.217:~/passt(master✗)

passt will fork into background as soon as it gets a connection. Can
you retry running it as:

  ./passt -f -d

so we can see if something strange is going on (-f stands for
--foreground, -d for --debug)? Because:

> > ```
> > Another terminal:
> > ```
> > $ bash x.sh
> > recv: Connection reset by peer
> > Probe of /tmp/passt_1.socket failed
> > connect: No such file or directory
> > Probe of /tmp/passt_2.socket failed
> > connect: No such file or directory
> > Probe of /tmp/passt_3.socket failed
> > ...
> > ```

...well, this shouldn't happen. It's probing /tmp/passt_1.socket and
failing to get an answer.

This just happened to another user, in that case seccomp was
terminating passt because on his system daemon() called a different set
of syscalls compared to my system:

  https://passt.top/passt/commit/?id=1fc6416cf9446cbf75818fd61772610e74709065

and I expect some more issues like these at the beginning, I didn't
test it on different distributions yet.

By the way, I'm working on adding tests for a few distributions right
now, so that we can catch those early.

-- 
Stefano
Hi Stefano,

The previous test already included that fix.
This is my repo HEAD:

* 2c7431f - (HEAD -> master, origin/master, origin/HEAD) README: Feature list, links to lists, bugs, chat (6 days ago) <Stefano Brivio>
* a77c5ef - README, perf_report: Markdown and CSS fixes (7 days ago) <Stefano Brivio>
* 94c7c1d - slirp4netns.sh: Fix up usage, exit 0 on --help (7 days ago) <Stefano Brivio>
* 1fc6416 - seccomp: Add newfstatat to list of allowed syscalls (7 days ago) <Stefano Brivio>
* d36e429 - netlink: Fix length of address attribute (7 days ago) <Stefano Brivio>

My OS is Fedora 35, x64 version.
I will try to dig into it when I have some time.

Thanks,
Feng Li

On Thu, Oct 28, 2021 at 3:30 PM Stefano Brivio <sbrivio(a)redhat.com> wrote:
> [...]
Hi Stefano,

I got the coredump file, it reports the `fork` syscall is bad:

Program terminated with signal SIGSYS, Bad system call.
#0 __GI__Fork () at ../sysdeps/nptl/_Fork.c:50
50 return pid;
(gdb) bt
#0 __GI__Fork () at ../sysdeps/nptl/_Fork.c:50
#1 0x00007f8c04fdc02a in __libc_fork () at fork.c:73
#2 0x00007f8c05009f8b in daemon (nochdir=0, noclose=0) at daemon.c:48
#3 0x000000000040c1e9 in main (argc=1, argv=0x7ffd10b5cd78) at passt.c:368
quit)

Looks like the seccomp is still badly configured.
I have little knowledge about the seccomp.
I have tried to add the fork like this, but it doesn't work.

--- a/passt.c
+++ b/passt.c
@@ -278,7 +278,7 @@ static void pid_file(struct ctx *c) {
  * #syscalls prlimit64 epoll_ctl epoll_create1 epoll_wait accept4 accept listen
  * #syscalls socket bind connect getsockopt setsockopt recvfrom sendto shutdown
  * #syscalls openat fstat fcntl lseek clone setsid exit_group getpid
- * #syscalls clock_gettime newfstatat
+ * #syscalls clock_gettime newfstatat fork
  * #syscalls:pasta rt_sigreturn
  */
 int main(int argc, char **argv)

Thanks,
Feng Li

On Fri, Oct 29, 2021 at 11:33 AM Li Feng <fengli(a)smartx.com> wrote:
> [...]
Hi Feng Li,

On Fri, 29 Oct 2021 13:27:44 +0800
Li Feng <fengli(a)smartx.com> wrote:

> Hi Stefano,
>
> I got the coredump file, it reports the `fork` syscall is bad:
>
> Program terminated with signal SIGSYS, Bad system call.
> #0 __GI__Fork () at ../sysdeps/nptl/_Fork.c:50
> 50 return pid;
> (gdb) bt
> #0 __GI__Fork () at ../sysdeps/nptl/_Fork.c:50
> #1 0x00007f8c04fdc02a in __libc_fork () at fork.c:73
> #2 0x00007f8c05009f8b in daemon (nochdir=0, noclose=0) at daemon.c:48
> #3 0x000000000040c1e9 in main (argc=1, argv=0x7ffd10b5cd78) at passt.c:368
> quit)

That's not necessarily because of fork() -- fork() is already in the
list of allowed syscalls. The signal is asynchronous, it might be
received a bit before or after passt is executing what you see in gdb.

This is probably another syscall triggered by daemon() in the specific
glibc version (2.34-7.fc35) on your system -- I haven't tested Fedora 35
yet.

An easy way to find out which one is the syscall causing this is using
strace. For example, suppose I forgot to add listen() to the list of
allowed syscalls:

diff --git a/passt.c b/passt.c
index 6436a45..43249cf 100644
--- a/passt.c
+++ b/passt.c
@@ -277,3 +277,3 @@ static void pid_file(struct ctx *c) {
  * #syscalls read write open close fork dup2 exit chdir ioctl writev syslog
- * #syscalls prlimit64 epoll_ctl epoll_create1 epoll_wait accept4 accept listen
+ * #syscalls prlimit64 epoll_ctl epoll_create1 epoll_wait accept4 accept
  * #syscalls socket bind connect getsockopt setsockopt recvfrom sendto shutdown

Then:

$ strace ./passt
[...]
setsockopt(6, SOL_SOCKET, SO_SNDBUF, [1073741823], 4) = 0
getsockopt(6, SOL_SOCKET, SO_SNDBUF, [268435456], [4]) = 0
setsockopt(6, SOL_SOCKET, SO_RCVBUF, [1073741823], 4) = 0
getsockopt(6, SOL_SOCKET, SO_RCVBUF, [268435456], [4]) = 0
close(6) = 0
socket(AF_UNIX, SOCK_STREAM, 0) = 6
socket(AF_UNIX, SOCK_STREAM|SOCK_NONBLOCK, 0) = 7
connect(7, {sa_family=AF_UNIX, sun_path="/tmp/passt_1.socket"}, 110) = -1 ENOENT (No such file or directory)
close(7) = 0
unlink("/tmp/passt_1.socket") = -1 ENOENT (No such file or directory)
bind(6, {sa_family=AF_UNIX, sun_path="/tmp/passt_1.socket"}, 110) = 0
write(2, "UNIX domain socket bound at /tmp"..., 48UNIX domain socket bound at /tmp/passt_1.socket
) = 48
write(2, "\n", 1
) = 1
listen(6, 0) = ?
+++ killed by SIGSYS +++
Bad system call

you would see that listen() is the first syscall not returning here
(strace can't see a return from there).

It's around daemon(), and the process might have forked already, so you
should run strace with the -f option, which also traces child processes:

  strace -f ./passt

the missing syscall should now be obvious from the output.

> Looks like the seccomp is still badly configured.
> I have little knowledge about the seccomp.

Short summary: this is seccomp in filter mode (seccomp-bpf), it's a
mechanism to block the system call (terminating the process, here) in
case it's a syscall we didn't expect to be executed.

It's a security feature: that's to avoid that an attacker, who already
gained some control over the process execution, is able to potentially
exploit a further vulnerability (e.g. in the kernel) by executing a
particular syscall. This is a relatively famous example of it:

  https://reverse.put.as/2017/11/07/exploiting-cve-2017-5123/

passt implements this as a list of syscalls in code comments, those are
translated by seccomp.sh into a BPF program, which is then loaded by
seccomp() in passt.c. If a syscall not included in the resulting list is
triggered, the kernel will terminate the process with a SIGSYS signal.

However, different C libraries (on different architectures) might issue
different syscalls to implement the same function (daemon(), here), and
the list I made was just tested on the systems I use and the reports of
a few other users, so some are surely missing right now.

While adding tests for OpenSUSE and Debian, I already found a few
alternative syscalls for some functions (I'll prepare a patch soon) -- I
haven't started with Fedora 35 tests yet.

-- 
Stefano
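As a rough illustration of the allowlist mechanism described above, here is a minimal C sketch. It is not passt's actual code: the helper names build_allowlist() and install_allowlist() are hypothetical, the architecture check that a real filter needs is left out for brevity, and the choice of SECCOMP_RET_KILL_PROCESS as the action for unlisted syscalls is an assumption made for this example.

```c
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Hypothetical helper, not from passt: fill prog[] with a BPF program that
 * allows the syscalls listed in allowed[] and kills the process on any
 * other syscall. Returns the number of instructions written, or 0 if
 * prog[] is too small.
 * NOTE: the architecture check a real filter needs is omitted here.
 */
size_t build_allowlist(const long *allowed, size_t n,
		       struct sock_filter *prog, size_t max)
{
	size_t i, len = 0;

	if (max < 2 * n + 2)
		return 0;

	/* Load the syscall number into the BPF accumulator */
	prog[len++] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
				offsetof(struct seccomp_data, nr));

	for (i = 0; i < n; i++) {
		/* On a match, fall through to ALLOW; otherwise skip over it */
		prog[len++] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
					(unsigned int)allowed[i], 0, 1);
		prog[len++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
					SECCOMP_RET_ALLOW);
	}

	/* Nothing matched: the process is terminated with SIGSYS */
	prog[len++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
				SECCOMP_RET_KILL_PROCESS);
	return len;
}

/* Hypothetical helper: load the program for the calling process */
int install_allowlist(struct sock_filter *prog, size_t len)
{
	struct sock_fprog fprog = { .len = (unsigned short)len, .filter = prog };

	/* Required to install a filter without CAP_SYS_ADMIN */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		return -1;

	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &fprog);
}
```

With an array such as { SYS_read, SYS_write, SYS_exit_group } as input, any other syscall made after install_allowlist() returns would end the process the same way the reports in this thread show. In passt the list comes from the #syscalls comments via seccomp.sh rather than from a hand-written array.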
On Fri, Oct 29, 2021 at 3:44 PM Stefano Brivio <sbrivio(a)redhat.com> wrote:
> [...]
> the missing syscall should now be obvious from the output.

Thanks for the detailed explanation.
I finally found out that the `qrap` was the root cause.
I patched the qemu, and it works well.
In the VM, I got the IP 192.168.64.217, which is the same as the host's.

This is in VM:
root@192.168.64.217 08:42:35 /tmp
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65520 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
    altname enp0s2
    inet 192.168.64.217/20 brd 192.168.79.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::9fa9:7232:8d10:96be/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

I have tested the ping and curl, it works, amazing!

> [...]
> While adding tests for OpenSUSE and Debian, I already found a few
> alternative syscalls for some functions (I'll prepare a patch soon) -- I
> haven't started with Fedora 35 tests yet.

This background knowledge helped me a lot.
The seccomp works well with Fedora 35 without the `qrap`.
Thanks again for your great work.
On Fri, 29 Oct 2021 16:54:47 +0800
Li Feng <fengli(a)smartx.com> wrote:

> [...]
>
> Thanks for the detailed explanation.
> I finally found out that the `qrap` was the root cause.
> I patched the qemu, and it works well.

Have you found out what was the offending syscall? I'll probably hit
this later too, but that would help me double checking what the problem
was.

> [...]
>
> This background knowledge helped me a lot.
> The seccomp works well with Fedora 35 without the `qrap`.
> Thanks again for your great work.

You're welcome, I'm glad to hear it works for you!

-- 
Stefano
On Fri, Oct 29, 2021 at 5:34 PM Stefano Brivio <sbrivio(a)redhat.com> wrote:
> [...]
> Have you found out what was the offending syscall? I'll probably hit
> this later too, but that would help me double checking what the problem
> was.

I made a mistake in the previous mail, using `./passt -f` works, but if
running in background, it still exits without any output.

This is the strace output.
```
$ strace -f ./passt
...
...
accept(6, NULL, NULL) = 7
epoll_ctl(5, EPOLL_CTL_ADD, 7, {events=EPOLLIN|EPOLLRDHUP|EPOLLET, data={u32=7, u64=7}}) = 0
getrandom("\x40\xfc\xc5\x4a\x29\x3e\xdb\xcd\x25\x92\xc6\xc3\xc7\xcb\x57\x5a", 16, GRND_RANDOM) = 16
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 8
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 9
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 10
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 11
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 12
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 13
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 14
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 15
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 16
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 17
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 18
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 19
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 20
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 21
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 22
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 23
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 24
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 25
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 26
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 27
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 28
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 29
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 30
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 31
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 32
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 33
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 34
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 35
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 36
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 37
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 38
socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 39
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLDstrace: Process 172939 attached
, child_tidptr=0x7f7b89b1da10) = 172939
[pid 172921] exit_group(0 <unfinished ...>
[pid 172939] set_robust_list(0x7f7b89b1da20, 24 <unfinished ...>
[pid 172921] <... exit_group resumed>) = ?
[pid 172921] +++ exited with 0 +++
<... set_robust_list resumed>) = ?
+++ killed by SIGSYS (core dumped) +++
```

Which is the bad syscall?
On Fri, 29 Oct 2021 19:02:15 +0800
Li Feng <fengli(a)smartx.com> wrote:

> [...]
> clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLDstrace: Process 172939 attached
> , child_tidptr=0x7f7b89b1da10) = 172939
> [pid 172921] exit_group(0 <unfinished ...>
> [pid 172939] set_robust_list(0x7f7b89b1da20, 24 <unfinished ...>
> [pid 172921] <... exit_group resumed>) = ?
> [pid 172921] +++ exited with 0 +++
> <... set_robust_list resumed>) = ?
> +++ killed by SIGSYS (core dumped) +++
> ```
>
> Which is the bad syscall?

Oh, it's set_robust_list(), it's normal that exit_group() doesn't
return. That new usage probably comes from:

  https://sourceware.org/git/?p=glibc.git;a=commit;h=9a7565403758f65c07fe3705…

and that code path is not really needed for passt, so I would have a
quick try at avoiding it rather than adding a syscall, perhaps with a
small replacement of daemon() using clone() instead of fork().

Meanwhile, this should work for you:

diff --git a/passt.c b/passt.c
index 6436a45..2a4ba8b 100644
--- a/passt.c
+++ b/passt.c
@@ -280,3 +280,3 @@ static void pid_file(struct ctx *c) {
  * #syscalls openat fstat fcntl lseek clone setsid exit_group getpid
- * #syscalls clock_gettime newfstatat
+ * #syscalls clock_gettime newfstatat set_robust_list
  * #syscalls:pasta rt_sigreturn

-- 
Stefano
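The rule of thumb used in this exchange (look for the syscall that strace shows as never returning right before the SIGSYS kill, and ignore an exit_group() that ends with a normal exit) could be automated roughly as follows. This is only a sketch in C: the helper names syscall_name() and sigsys_culprit() are hypothetical, the code assumes the `strace -f` line layout seen in this thread, and for simplicity it does not track PIDs separately, so heavily interleaved output from several processes might fool it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the syscall name found on one strace line
 * into name[]. Handles "name(args...", "[pid N] name(args..." and
 * "<... name resumed>". Returns 1 on success, 0 otherwise.
 */
static int syscall_name(const char *line, char *name, size_t len)
{
	const char *p = line, *end;

	if (!strncmp(p, "[pid", 4)) {		/* prefix added by strace -f */
		p = strchr(p, ']');
		if (!p)
			return 0;
		p++;
		while (*p == ' ')
			p++;
	}

	if (!strncmp(p, "<... ", 5)) {		/* "<... name resumed>" form */
		p += 5;
		end = strchr(p, ' ');
	} else {				/* "name(args..." form */
		end = strchr(p, '(');
	}

	if (!end || end == p || (size_t)(end - p) >= len)
		return 0;

	memcpy(name, p, (size_t)(end - p));
	name[end - p] = '\0';
	return 1;
}

/* Hypothetical helper: scan strace output lines and report the last
 * syscall without a return value ("= ?") seen before the SIGSYS kill.
 * Returns 1 and fills name[] if found, 0 otherwise.
 */
int sigsys_culprit(const char *const *lines, size_t n, char *name, size_t len)
{
	char last[64] = "";

	for (size_t i = 0; i < n; i++) {
		if (strstr(lines[i], "+++ killed by SIGSYS")) {
			if (!last[0] || strlen(last) >= len)
				return 0;
			strcpy(name, last);
			return 1;
		}
		if (strstr(lines[i], "= ?"))
			syscall_name(lines[i], last, sizeof(last));
	}

	return 0;
}
```

Fed the last six lines of the strace output quoted above, this would pick the set_robust_list() line, because that is the last "= ?" entry before the kill marker, while the earlier exit_group() entry is followed by a regular exit.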
On Fri, Oct 29, 2021 at 7:52 PM Stefano Brivio <sbrivio(a)redhat.com> wrote:
> [...]
> Meanwhile, this should work for you:
> [...]

It works!
Hello again,

On Fri, 29 Oct 2021 13:52:31 +0200
Stefano Brivio <sbrivio(a)redhat.com> wrote:

> On Fri, 29 Oct 2021 19:02:15 +0800
> Li Feng <fengli(a)smartx.com> wrote:
>
> > [...]
> > <... set_robust_list resumed>) = ?
> > +++ killed by SIGSYS (core dumped) +++
> > ```
> >
> > Which is the bad syscall?
>
> Oh, it's set_robust_list(), it's normal that exit_group() doesn't
> return. That new usage probably comes from:
>
>   https://sourceware.org/git/?p=glibc.git;a=commit;h=9a7565403758f65c07fe3705…
>
> and that code path is not really needed for passt, so I would have a
> quick try at avoiding it rather than adding a syscall, perhaps with a
> small replacement of daemon() using clone() instead of fork().

I forgot to follow up here: in the end I pushed a change that, among
other calls, allows set_robust_list() unconditionally:

  https://passt.top/passt/commit/?id=33b1bdd079f1b40dffb040e40579d7434c28d10a

alternatives looked quite awkward, a replacement of daemon() also
didn't look very small, other suggestions warmly welcome.

By the way, the new tests checking build and basic functionality also
cover Fedora 35.

-- 
Stefano