With this series, I'm almost able to run the full test suite together with the new command dispatch mechanism. I still hit frequent failures in the passt_tcp performance test, so I'm not pushing this out at the moment, but as it's taking me a while, I'd rather share this earlier. Included are also two fixes for harmless (but ugly) issues discovered by Coverity. Stefano Brivio (7): test/perf: Always use /sbin/sysctl in tcp test test/perf: Switch performance test duration to 10 seconds instead of 30 tap: Check return value of accept4() before calling getsockopt() conf, tcp, udp: Arrays for ports need 2^16 values, not 2^16-8 test/lib: Restore IFS while executing directives in def blocks test/lib: Wait on iperf3 clients to be done, then send SIGINT to servers test/perf: Disable periodic throughput reports to avoid vhost hang conf.c | 2 +- tap.c | 6 ++++-- tcp.h | 4 ++-- test/lib/test | 20 ++++++++++---------- test/perf/passt_tcp | 10 +++++----- test/perf/passt_udp | 4 ++-- test/perf/pasta_tcp | 10 +++++----- test/perf/pasta_udp | 4 ++-- udp.h | 4 ++-- 9 files changed, 33 insertions(+), 31 deletions(-) -- 2.35.1
Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- test/perf/passt_tcp | 6 +++--- test/perf/pasta_tcp | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/test/perf/passt_tcp b/test/perf/passt_tcp index f0cfa1b..bee4f9e 100644 --- a/test/perf/passt_tcp +++ b/test/perf/passt_tcp @@ -30,9 +30,9 @@ guest /sbin/sysctl -w net.ipv4.tcp_rmem="4096 131072 268435456" guest /sbin/sysctl -w net.ipv4.tcp_wmem="4096 131072 268435456" guest /sbin/sysctl -w net.ipv4.tcp_timestamps=0 -ns sysctl -w net.ipv4.tcp_rmem="4096 524288 134217728" -ns sysctl -w net.ipv4.tcp_wmem="4096 524288 134217728" -ns sysctl -w net.ipv4.tcp_timestamps=0 +ns /sbin/sysctl -w net.ipv4.tcp_rmem="4096 524288 134217728" +ns /sbin/sysctl -w net.ipv4.tcp_wmem="4096 524288 134217728" +ns /sbin/sysctl -w net.ipv4.tcp_timestamps=0 gout GW ip -j -4 route show|jq -rM '.[] | select(.dst == "default").gateway' gout GW6 ip -j -6 route show|jq -rM '.[] | select(.dst == "default").gateway' diff --git a/test/perf/pasta_tcp b/test/perf/pasta_tcp index 602ce52..cc21075 100644 --- a/test/perf/pasta_tcp +++ b/test/perf/pasta_tcp @@ -16,9 +16,9 @@ nstools /sbin/sysctl nproc ip seq sleep iperf3 tcp_rr tcp_crr jq sed test pasta: throughput and latency (local connections) -ns sysctl -w net.ipv4.tcp_rmem="131072 524288 134217728" -ns sysctl -w net.ipv4.tcp_wmem="131072 524288 134217728" -ns sysctl -w net.ipv4.tcp_timestamps=0 +ns /sbin/sysctl -w net.ipv4.tcp_rmem="131072 524288 134217728" +ns /sbin/sysctl -w net.ipv4.tcp_wmem="131072 524288 134217728" +ns /sbin/sysctl -w net.ipv4.tcp_timestamps=0 set THREADS 2 -- 2.35.1
On Wed, Sep 21, 2022 at 10:55:01PM +0200, Stefano Brivio wrote:Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>--- test/perf/passt_tcp | 6 +++--- test/perf/pasta_tcp | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/test/perf/passt_tcp b/test/perf/passt_tcp index f0cfa1b..bee4f9e 100644 --- a/test/perf/passt_tcp +++ b/test/perf/passt_tcp @@ -30,9 +30,9 @@ guest /sbin/sysctl -w net.ipv4.tcp_rmem="4096 131072 268435456" guest /sbin/sysctl -w net.ipv4.tcp_wmem="4096 131072 268435456" guest /sbin/sysctl -w net.ipv4.tcp_timestamps=0 -ns sysctl -w net.ipv4.tcp_rmem="4096 524288 134217728" -ns sysctl -w net.ipv4.tcp_wmem="4096 524288 134217728" -ns sysctl -w net.ipv4.tcp_timestamps=0 +ns /sbin/sysctl -w net.ipv4.tcp_rmem="4096 524288 134217728" +ns /sbin/sysctl -w net.ipv4.tcp_wmem="4096 524288 134217728" +ns /sbin/sysctl -w net.ipv4.tcp_timestamps=0 gout GW ip -j -4 route show|jq -rM '.[] | select(.dst == "default").gateway' gout GW6 ip -j -6 route show|jq -rM '.[] | select(.dst == "default").gateway' diff --git a/test/perf/pasta_tcp b/test/perf/pasta_tcp index 602ce52..cc21075 100644 --- a/test/perf/pasta_tcp +++ b/test/perf/pasta_tcp @@ -16,9 +16,9 @@ nstools /sbin/sysctl nproc ip seq sleep iperf3 tcp_rr tcp_crr jq sed test pasta: throughput and latency (local connections) -ns sysctl -w net.ipv4.tcp_rmem="131072 524288 134217728" -ns sysctl -w net.ipv4.tcp_wmem="131072 524288 134217728" -ns sysctl -w net.ipv4.tcp_timestamps=0 +ns /sbin/sysctl -w net.ipv4.tcp_rmem="131072 524288 134217728" +ns /sbin/sysctl -w net.ipv4.tcp_wmem="131072 524288 134217728" +ns /sbin/sysctl -w net.ipv4.tcp_timestamps=0 set THREADS 2-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
It looks like the workaround for the virtio_net TX hang issue is working less reliably with the new command dispatch mechanism, I'm not sure why. Switch to 10 seconds, at least for the moment. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- test/perf/passt_tcp | 2 +- test/perf/passt_udp | 2 +- test/perf/pasta_tcp | 2 +- test/perf/pasta_udp | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/test/perf/passt_tcp b/test/perf/passt_tcp index bee4f9e..5f0aa3a 100644 --- a/test/perf/passt_tcp +++ b/test/perf/passt_tcp @@ -44,7 +44,7 @@ hout FREQ [ -n "__FREQ_CPUFREQ__" ] && echo __FREQ_CPUFREQ__ || echo __FREQ_PROC set THREADS 1 set STREAMS 8 -set TIME 30 +set TIME 10 hout OMIT echo __TIME__ / 6 | bc -l set OPTS -Z -P __STREAMS__ -l 1M -i1 -O__OMIT__ --pacing-timer 1000000 diff --git a/test/perf/passt_udp b/test/perf/passt_udp index 80731d1..6bd86ff 100644 --- a/test/perf/passt_udp +++ b/test/perf/passt_udp @@ -37,7 +37,7 @@ hout FREQ [ -n "__FREQ_CPUFREQ__" ] && echo __FREQ_CPUFREQ__ || echo __FREQ_PROC set THREADS 4 set STREAMS 1 -set TIME 30 +set TIME 10 set OPTS -u -i1 -P __STREAMS__ --pacing-timer 1000 info Throughput in Gbps, latency in µs, __THREADS__ threads at __FREQ__ GHz, one stream each diff --git a/test/perf/pasta_tcp b/test/perf/pasta_tcp index cc21075..44c5e54 100644 --- a/test/perf/pasta_tcp +++ b/test/perf/pasta_tcp @@ -23,7 +23,7 @@ ns /sbin/sysctl -w net.ipv4.tcp_timestamps=0 set THREADS 2 set STREAMS 2 -set TIME 30 +set TIME 10 hout OMIT echo __TIME__ / 6 | bc -l set OPTS -Z -w 4M -l 1M -P __STREAMS__ -i1 -O__OMIT__ --pacing-timer 10000 diff --git a/test/perf/pasta_udp b/test/perf/pasta_udp index ae898b1..abb88b0 100644 --- a/test/perf/pasta_udp +++ b/test/perf/pasta_udp @@ -22,7 +22,7 @@ hout FREQ [ -n "__FREQ_CPUFREQ__" ] && echo __FREQ_CPUFREQ__ || echo __FREQ_PROC set THREADS 1 set STREAMS 4 -set TIME 30 +set TIME 10 set OPTS -u -i1 -P __STREAMS__ info Throughput in Gbps, latency in µs, one thread at __FREQ__ GHz, __STREAMS__ streams -- 2.35.1
On Wed, Sep 21, 2022 at 10:55:02PM +0200, Stefano Brivio wrote:It looks like the workaround for the virtio_net TX hang issue is working less reliably with the new command dispatch mechanism, I'm not sure why. Switch to 10 seconds, at least for the moment. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>--- test/perf/passt_tcp | 2 +- test/perf/passt_udp | 2 +- test/perf/pasta_tcp | 2 +- test/perf/pasta_udp | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/test/perf/passt_tcp b/test/perf/passt_tcp index bee4f9e..5f0aa3a 100644 --- a/test/perf/passt_tcp +++ b/test/perf/passt_tcp @@ -44,7 +44,7 @@ hout FREQ [ -n "__FREQ_CPUFREQ__" ] && echo __FREQ_CPUFREQ__ || echo __FREQ_PROC set THREADS 1 set STREAMS 8 -set TIME 30 +set TIME 10 hout OMIT echo __TIME__ / 6 | bc -l set OPTS -Z -P __STREAMS__ -l 1M -i1 -O__OMIT__ --pacing-timer 1000000 diff --git a/test/perf/passt_udp b/test/perf/passt_udp index 80731d1..6bd86ff 100644 --- a/test/perf/passt_udp +++ b/test/perf/passt_udp @@ -37,7 +37,7 @@ hout FREQ [ -n "__FREQ_CPUFREQ__" ] && echo __FREQ_CPUFREQ__ || echo __FREQ_PROC set THREADS 4 set STREAMS 1 -set TIME 30 +set TIME 10 set OPTS -u -i1 -P __STREAMS__ --pacing-timer 1000 info Throughput in Gbps, latency in µs, __THREADS__ threads at __FREQ__ GHz, one stream each diff --git a/test/perf/pasta_tcp b/test/perf/pasta_tcp index cc21075..44c5e54 100644 --- a/test/perf/pasta_tcp +++ b/test/perf/pasta_tcp @@ -23,7 +23,7 @@ ns /sbin/sysctl -w net.ipv4.tcp_timestamps=0 set THREADS 2 set STREAMS 2 -set TIME 30 +set TIME 10 hout OMIT echo __TIME__ / 6 | bc -l set OPTS -Z -w 4M -l 1M -P __STREAMS__ -i1 -O__OMIT__ --pacing-timer 10000 diff --git a/test/perf/pasta_udp b/test/perf/pasta_udp index ae898b1..abb88b0 100644 --- a/test/perf/pasta_udp +++ b/test/perf/pasta_udp @@ -22,7 +22,7 @@ hout FREQ [ -n "__FREQ_CPUFREQ__" ] && echo __FREQ_CPUFREQ__ || echo __FREQ_PROC set THREADS 1 set STREAMS 4 -set TIME 30 +set TIME 10 set OPTS -u -i1 -P __STREAMS__ info Throughput in Gbps, latency in µs, one thread at __FREQ__ GHz, __STREAMS__ streams-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
Reported by Coverity (CWE-119): Negative value used as argument to a function expecting a positive value (for example, size of buffer or allocation) and harmless, because getsockopt() would return -EBADF if the socket is -1, so we wouldn't print anything. Check if accept4() returns a valid socket before calling getsockopt() on it. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- tap.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/tap.c b/tap.c index 3231da7..4d7422f 100644 --- a/tap.c +++ b/tap.c @@ -872,11 +872,13 @@ static void tap_sock_unix_new(struct ctx *c) int discard = accept4(c->fd_tap_listen, NULL, NULL, SOCK_NONBLOCK); + if (discard == -1) + return; + if (!getsockopt(discard, SOL_SOCKET, SO_PEERCRED, &ucred, &len)) info("discarding connection from PID %i", ucred.pid); - if (discard != -1) - close(discard); + close(discard); return; } -- 2.35.1
On Wed, Sep 21, 2022 at 10:55:03PM +0200, Stefano Brivio wrote:Reported by Coverity (CWE-119): Negative value used as argument to a function expecting a positive value (for example, size of buffer or allocation) and harmless, because getsockopt() would return -EBADF if the socket is -1, so we wouldn't print anything. Check if accept4() returns a valid socket before calling getsockopt() on it. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>--- tap.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/tap.c b/tap.c index 3231da7..4d7422f 100644 --- a/tap.c +++ b/tap.c @@ -872,11 +872,13 @@ static void tap_sock_unix_new(struct ctx *c) int discard = accept4(c->fd_tap_listen, NULL, NULL, SOCK_NONBLOCK); + if (discard == -1) + return; + if (!getsockopt(discard, SOL_SOCKET, SO_PEERCRED, &ucred, &len)) info("discarding connection from PID %i", ucred.pid); - if (discard != -1) - close(discard); + close(discard); return; }-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
Reported by David but also by Coverity (CWE-119): In conf_ports: Out-of-bounds access to a buffer ...not in practice, because the allocation size is rounded up anyway, but not nice either. Reported-by: David Gibson <david(a)gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- conf.c | 2 +- tcp.h | 4 ++-- udp.h | 4 ++-- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/conf.c b/conf.c index d80233c..7ecfa1e 100644 --- a/conf.c +++ b/conf.c @@ -127,8 +127,8 @@ static int conf_ports(struct ctx *c, char optname, const char *optarg, { int start_src, end_src, start_dst, end_dst, exclude_only = 1, i, port; char addr_buf[sizeof(struct in6_addr)] = { 0 }, *addr = addr_buf; + uint8_t *map, exclude[DIV_ROUND_UP(USHRT_MAX, 8)] = { 0 }; void (*remap)(in_port_t port, in_port_t delta); - uint8_t *map, exclude[USHRT_MAX / 8] = { 0 }; char buf[BUFSIZ], *sep, *spec, *p; sa_family_t af = AF_UNSPEC; diff --git a/tcp.h b/tcp.h index 7b720c1..6431b75 100644 --- a/tcp.h +++ b/tcp.h @@ -69,9 +69,9 @@ struct tcp_ctx { uint64_t hash_secret[2]; int conn_count; int splice_conn_count; - uint8_t port_to_tap [USHRT_MAX / 8]; + uint8_t port_to_tap [DIV_ROUND_UP(USHRT_MAX, 8)]; int init_detect_ports; - uint8_t port_to_init [USHRT_MAX / 8]; + uint8_t port_to_init [DIV_ROUND_UP(USHRT_MAX, 8)]; int ns_detect_ports; struct timespec timer_run; #ifdef HAS_SND_WND diff --git a/udp.h b/udp.h index f16fe5e..8f82842 100644 --- a/udp.h +++ b/udp.h @@ -53,9 +53,9 @@ union udp_epoll_ref { * @timer_run: Timestamp of most recent timer run */ struct udp_ctx { - uint8_t port_to_tap [USHRT_MAX / 8]; + uint8_t port_to_tap [DIV_ROUND_UP(USHRT_MAX, 8)]; int init_detect_ports; - uint8_t port_to_init [USHRT_MAX / 8]; + uint8_t port_to_init [DIV_ROUND_UP(USHRT_MAX, 8)]; int ns_detect_ports; struct timespec timer_run; }; -- 2.35.1
On Wed, Sep 21, 2022 at 10:55:04PM +0200, Stefano Brivio wrote:Reported by David but also by Coverity (CWE-119): In conf_ports: Out-of-bounds access to a buffer ...not in practice, because the allocation size is rounded up anyway, but not nice either.So... this helps, but it's not enough. For starters, this is still conceptually wrong - there should be 65536 bits in the bitmap, not 65535 (USHRT_MAX). This patch just masks it by rounding up to 65536 rather than down to 65528. Alas, this is not the only buffer overrun caused by an off-by-one in port numbers: the delta_to_tap etc. global arrays in tcp.c and udp.c should also have length 65536, rather than USHRT_MAX. So should tcp_sock_init_lo and friends. There are also similar problems in icmp.c with echo request id rather than port number. Not a buffer overrun, but the USHRT_MAX in udp_remap_to_tap() and udp_remap_to_init() is also out by one (consider the case of delta==0). I have a series which makes a number of cleanups to the port mapping handling. It fixes this bug and adds typing stuff that should make it harder to make similar mistakes in future. I held off on sending, since it's currently based on a lot of my outstanding patches. I think it wouldn't be too hard to rebase directly on master, should I do that so you can fix this before going on to debug the rest of my outstanding stuff?Reported-by: David Gibson <david(a)gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- conf.c | 2 +- tcp.h | 4 ++-- udp.h | 4 ++-- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/conf.c b/conf.c index d80233c..7ecfa1e 100644 --- a/conf.c +++ b/conf.c @@ -127,8 +127,8 @@ static int conf_ports(struct ctx *c, char optname, const char *optarg, { int start_src, end_src, start_dst, end_dst, exclude_only = 1, i, port; char addr_buf[sizeof(struct in6_addr)] = { 0 }, *addr = addr_buf; + uint8_t *map, exclude[DIV_ROUND_UP(USHRT_MAX, 8)] = { 0 }; void (*remap)(in_port_t port, in_port_t delta); - uint8_t *map, exclude[USHRT_MAX / 8] = { 0 }; char buf[BUFSIZ], *sep, *spec, *p; sa_family_t af = AF_UNSPEC; diff --git a/tcp.h b/tcp.h index 7b720c1..6431b75 100644 --- a/tcp.h +++ b/tcp.h @@ -69,9 +69,9 @@ struct tcp_ctx { uint64_t hash_secret[2]; int conn_count; int splice_conn_count; - uint8_t port_to_tap [USHRT_MAX / 8]; + uint8_t port_to_tap [DIV_ROUND_UP(USHRT_MAX, 8)]; int init_detect_ports; - uint8_t port_to_init [USHRT_MAX / 8]; + uint8_t port_to_init [DIV_ROUND_UP(USHRT_MAX, 8)]; int ns_detect_ports; struct timespec timer_run; #ifdef HAS_SND_WND diff --git a/udp.h b/udp.h index f16fe5e..8f82842 100644 --- a/udp.h +++ b/udp.h @@ -53,9 +53,9 @@ union udp_epoll_ref { * @timer_run: Timestamp of most recent timer run */ struct udp_ctx { - uint8_t port_to_tap [USHRT_MAX / 8]; + uint8_t port_to_tap [DIV_ROUND_UP(USHRT_MAX, 8)]; int init_detect_ports; - uint8_t port_to_init [USHRT_MAX / 8]; + uint8_t port_to_init [DIV_ROUND_UP(USHRT_MAX, 8)]; int ns_detect_ports; struct timespec timer_run; };-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
On Thu, 22 Sep 2022 16:43:24 +1000 David Gibson <david(a)gibson.dropbear.id.au> wrote:On Wed, Sep 21, 2022 at 10:55:04PM +0200, Stefano Brivio wrote:Ah, yes, I indeed wanted to have 65536 values in the bitmap (see commit title), and this does it, but there's no reason to make it implicit.Reported by David but also by Coverity (CWE-119): In conf_ports: Out-of-bounds access to a buffer ...not in practice, because the allocation size is rounded up anyway, but not nice either.So... this helps, but it's not enough. For starters, this is still conceptually wrong - there should be 65536 bits in the bitmap, not 65535 (USHRT_MAX). This patch just masks it by rounding up to 65536 rather than down to 65528.Alas, this is not the only buffer overrun caused by an off-by-one in port numbers: the delta_to_tap etc. global arrays in tcp.c and udp.c should also have length 65536, rather than USHRT_MAX. So should tcp_sock_init_lo and friends. There are also similar problems in icmp.c with echo request id rather than port number. Not a buffer overrun, but the USHRT_MAX in udp_remap_to_tap() and udp_remap_to_init() is also out by one (consider the case of delta==0). I have a series which makes a number of cleanups to the port mapping handling. It fixes this bug and adds typing stuff that should make it harder to make similar mistakes in future. I held off on sending, since it's currently based on a lot of my outstanding patches. I think it wouldn't be too hard to rebase directly on master, should I do that so you can fix this before going on to debug the rest of my outstanding stuff?If it's not too much effort for you, yes, that would be appreciated. I hope to be able to push out everything else later today, it's just one glitch still occurring in test_iperf3() -- sometimes, clients and servers terminate properly, but we don't see it for some reason. -- Stefano
On Thu, 22 Sep 2022 10:21:50 +0200 Stefano Brivio <sbrivio(a)redhat.com> wrote:On Thu, 22 Sep 2022 16:43:24 +1000 David Gibson <david(a)gibson.dropbear.id.au> wrote: [...]Actually, it looks like I got tests finally passing, and I should be pushing all the pending patches soon... perhaps you could wait before you rebase, I guess it's more convenient. -- StefanoI have a series which makes a number of cleanups to the port mapping handling. It fixes this bug and adds typing stuff that should make it harder to make similar mistakes in future. I held off on sending, since it's currently based on a lot of my outstanding patches. I think it wouldn't be too hard to rebase directly on master, should I do that so you can fix this before going on to debug the rest of my outstanding stuff?If it's not too much effort for you, yes, that would be appreciated. I hope to be able to push out everything else later today, it's just one glitch still occurring in test_iperf3() -- sometimes, clients and servers terminate properly, but we don't see it for some reason.
On Fri, Sep 23, 2022 at 01:39:47AM +0200, Stefano Brivio wrote:On Thu, 22 Sep 2022 10:21:50 +0200 Stefano Brivio <sbrivio(a)redhat.com> wrote:Will do, thanks. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibsonOn Thu, 22 Sep 2022 16:43:24 +1000 David Gibson <david(a)gibson.dropbear.id.au> wrote: [...]Actually, it looks like I got tests finally passing, and I should be pushing all the pending patches soon... perhaps you could wait before you rebase, I guess it's more convenient.I have a series which makes a number of cleanups to the port mapping handling. It fixes this bug and adds typing stuff that should make it harder to make similar mistakes in future. I held off on sending, since it's currently based on a lot of my outstanding patches. I think it wouldn't be too hard to rebase directly on master, should I do that so you can fix this before going on to debug the rest of my outstanding stuff?If it's not too much effort for you, yes, that would be appreciated. I hope to be able to push out everything else later today, it's just one glitch still occurring in test_iperf3() -- sometimes, clients and servers terminate properly, but we don't see it for some reason.
If we don't, guest command dispatch will fail altogether, given that we use cat(1) on the enter file, which contains spaces. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- test/lib/test | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/test/lib/test b/test/lib/test index 3ad178f..a9ffe83 100755 --- a/test/lib/test +++ b/test/lib/test @@ -312,7 +312,7 @@ test_one_line() { IFS=' ' for __def_line in ${__def_body}; do - IFS= test_one_line "${__def_line}" + IFS="${__ifs}" test_one_line "${__def_line}" done IFS="${__ifs}" fi -- 2.35.1
On Wed, Sep 21, 2022 at 10:55:05PM +0200, Stefano Brivio wrote:If we don't, guest command dispatch will fail altogether, given that we use cat(1) on the enter file, which contains spaces. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>Huh. I wonder how this was working for me.--- test/lib/test | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/test/lib/test b/test/lib/test index 3ad178f..a9ffe83 100755 --- a/test/lib/test +++ b/test/lib/test @@ -312,7 +312,7 @@ test_one_line() { IFS=' ' for __def_line in ${__def_body}; do - IFS= test_one_line "${__def_line}" + IFS="${__ifs}" test_one_line "${__def_line}" done IFS="${__ifs}" fi-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
On Thu, 22 Sep 2022 16:44:31 +1000 David Gibson <david(a)gibson.dropbear.id.au> wrote:On Wed, Sep 21, 2022 at 10:55:05PM +0200, Stefano Brivio wrote:I thought you disabled that workaround, which should be the only 'def' used in non-distro tests (which are still using screen grabbing)...? -- StefanoIf we don't, guest command dispatch will fail altogether, given that we use cat(1) on the enter file, which contains spaces. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>Huh. I wonder how this was working for me.
On Thu, Sep 22, 2022 at 10:25:14AM +0200, Stefano Brivio wrote:On Thu, 22 Sep 2022 16:44:31 +1000 David Gibson <david(a)gibson.dropbear.id.au> wrote:Oh, right. I missed that this was specific to the 'def' handling. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibsonOn Wed, Sep 21, 2022 at 10:55:05PM +0200, Stefano Brivio wrote:I thought you disabled that workaround, which should be the only 'def' used in non-distro tests (which are still using screen grabbing)...?If we don't, guest command dispatch will fail altogether, given that we use cat(1) on the enter file, which contains spaces. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>Huh. I wonder how this was working for me.
An iperf3 client might fail to send the control message indicating the end of the test, if the kernel buffer doesn't accept it, and exit without having sent it, as the control socket is non-blocking. Should this happen, the server will just wait forever for this message, instead of terminating. Restore some of the behaviour that went away with the "test: Rewrite test_iperf3" patch: instead of waiting on servers to terminate, wait on the clients. When they are done, wait 2 seconds, and then send SIGINT to the servers, which make them still write out the JSON report before terminating. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- test/lib/test | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/test/lib/test b/test/lib/test index a9ffe83..b69d519 100755 --- a/test/lib/test +++ b/test/lib/test @@ -32,13 +32,11 @@ test_iperf3() { __time="${1}"; shift pane_or_context_run_bg "${__sctx}" \ - '(' \ - ' for i in $(seq 0 '${__procs}'); do' \ - ' iperf3 -s1J -p'${__port}' -i'${__time} \ - ' > s${i}.json &' \ - ' done;' \ - ' wait' \ - ')' + 'for i in $(seq 0 '${__procs}'); do' \ + ' (iperf3 -s1J -p'${__port}' -i'${__time} \ + ' > s${i}.json) &' \ + ' echo $! > s${i}.pid &' \ + 'done' \ sleep 1 # Wait for server to be ready @@ -51,7 +49,9 @@ test_iperf3() { ' wait' \ ')' - pane_or_context_wait "${__sctx}" + # If client fails to deliver control message, tell server we're done + pane_or_context_run "${__sctx}" \ + 'sleep 2; kill -INT $(cat s*.pid); rm s*.pid' __jval=".end.sum_received.bits_per_second" for __opt in ${@}; do -- 2.35.1
It appears that if we run throughput tests with one-second periodic reports, the sending side of the vhost channel used for SSH-based command dispatch occasionally stops working altogether. I haven't investigated this further, all I see is that output is truncated at some point, and doesn't resume. If we use gzip compression (ssh -C) this happens less frequently, but it still happens, seemingly indicating the issue is probably related to vhost itself. Disable periodic reports in iperf3 clients. The -i options were actually redundant, so remove them from both test files as well as from test_iperf3(). Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- test/lib/test | 2 +- test/perf/passt_tcp | 2 +- test/perf/passt_udp | 2 +- test/perf/pasta_tcp | 2 +- test/perf/pasta_udp | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/test/lib/test b/test/lib/test index b69d519..d68ade4 100755 --- a/test/lib/test +++ b/test/lib/test @@ -44,7 +44,7 @@ test_iperf3() { '(' \ ' for i in $(seq 0 '${__procs}'); do' \ ' iperf3 -c '${__dest}' -p '${__port} \ - ' -t'${__time}' -T s${i} '"${@}"' &' \ + ' -t'${__time}' -i0 -T s${i} '"${@}"' &' \ ' done;' \ ' wait' \ ')' diff --git a/test/perf/passt_tcp b/test/perf/passt_tcp index 5f0aa3a..5ba5450 100644 --- a/test/perf/passt_tcp +++ b/test/perf/passt_tcp @@ -46,7 +46,7 @@ set THREADS 1 set STREAMS 8 set TIME 10 hout OMIT echo __TIME__ / 6 | bc -l -set OPTS -Z -P __STREAMS__ -l 1M -i1 -O__OMIT__ --pacing-timer 1000000 +set OPTS -Z -P __STREAMS__ -l 1M -O__OMIT__ --pacing-timer 1000000 info Throughput in Gbps, latency in µs, one thread at __FREQ__ GHz, __STREAMS__ streams report passt tcp __THREADS__ __FREQ__ diff --git a/test/perf/passt_udp b/test/perf/passt_udp index 6bd86ff..fd2ddc1 100644 --- a/test/perf/passt_udp +++ b/test/perf/passt_udp @@ -38,7 +38,7 @@ hout FREQ [ -n "__FREQ_CPUFREQ__" ] && echo __FREQ_CPUFREQ__ || echo __FREQ_PROC set THREADS 4 set STREAMS 1 set TIME 10 -set OPTS -u -i1 -P __STREAMS__ --pacing-timer 1000 +set OPTS -u -P __STREAMS__ --pacing-timer 1000 info Throughput in Gbps, latency in µs, __THREADS__ threads at __FREQ__ GHz, one stream each diff --git a/test/perf/pasta_tcp b/test/perf/pasta_tcp index 44c5e54..1847c83 100644 --- a/test/perf/pasta_tcp +++ b/test/perf/pasta_tcp @@ -25,7 +25,7 @@ set THREADS 2 set STREAMS 2 set TIME 10 hout OMIT echo __TIME__ / 6 | bc -l -set OPTS -Z -w 4M -l 1M -P __STREAMS__ -i1 -O__OMIT__ --pacing-timer 10000 +set OPTS -Z -w 4M -l 1M -P __STREAMS__ -O__OMIT__ --pacing-timer 10000 hout FREQ_PROCFS (echo "scale=1"; sed -n 's/cpu MHz.*: \([0-9]*\)\..*$/(\1+10^2\/2)\/10^3/p' /proc/cpuinfo) | bc -l | head -n1 hout FREQ_CPUFREQ (echo "scale=1"; printf '( %i + 10^5 / 2 ) / 10^6\n' $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq) ) | bc -l diff --git a/test/perf/pasta_udp b/test/perf/pasta_udp index abb88b0..27ea724 100644 --- a/test/perf/pasta_udp +++ b/test/perf/pasta_udp @@ -23,7 +23,7 @@ hout FREQ [ -n "__FREQ_CPUFREQ__" ] && echo __FREQ_CPUFREQ__ || echo __FREQ_PROC set THREADS 1 set STREAMS 4 set TIME 10 -set OPTS -u -i1 -P __STREAMS__ +set OPTS -u -P __STREAMS__ info Throughput in Gbps, latency in µs, one thread at __FREQ__ GHz, __STREAMS__ streams -- 2.35.1
On Wed, Sep 21, 2022 at 10:55:07PM +0200, Stefano Brivio wrote:It appears that if we run throughput tests with one-second periodic reports, the sending side of the vhost channel used for SSH-based command dispatch occasionally stops working altogether. I haven't investigated this further, all I see is that output is truncated at some point, and doesn't resume.Huh. I don't think I ever observed this.If we use gzip compression (ssh -C) this happens less frequently, but it still happens, seemingly indicating the issue is probably related to vhost itself. Disable periodic reports in iperf3 clients. The -i options were actually redundant, so remove them from both test files as well as from test_iperf3(). Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- test/lib/test | 2 +- test/perf/passt_tcp | 2 +- test/perf/passt_udp | 2 +- test/perf/pasta_tcp | 2 +- test/perf/pasta_udp | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/test/lib/test b/test/lib/test index b69d519..d68ade4 100755 --- a/test/lib/test +++ b/test/lib/test @@ -44,7 +44,7 @@ test_iperf3() { '(' \ ' for i in $(seq 0 '${__procs}'); do' \ ' iperf3 -c '${__dest}' -p '${__port} \ - ' -t'${__time}' -T s${i} '"${@}"' &' \ + ' -t'${__time}' -i0 -T s${i} '"${@}"' &' \ ' done;' \ ' wait' \ ')' diff --git a/test/perf/passt_tcp b/test/perf/passt_tcp index 5f0aa3a..5ba5450 100644 --- a/test/perf/passt_tcp +++ b/test/perf/passt_tcp @@ -46,7 +46,7 @@ set THREADS 1 set STREAMS 8 set TIME 10 hout OMIT echo __TIME__ / 6 | bc -l -set OPTS -Z -P __STREAMS__ -l 1M -i1 -O__OMIT__ --pacing-timer 1000000 +set OPTS -Z -P __STREAMS__ -l 1M -O__OMIT__ --pacing-timer 1000000 info Throughput in Gbps, latency in µs, one thread at __FREQ__ GHz, __STREAMS__ streams report passt tcp __THREADS__ __FREQ__ diff --git a/test/perf/passt_udp b/test/perf/passt_udp index 6bd86ff..fd2ddc1 100644 --- a/test/perf/passt_udp +++ b/test/perf/passt_udp @@ -38,7 +38,7 @@ hout FREQ [ -n "__FREQ_CPUFREQ__" ] && echo __FREQ_CPUFREQ__ || echo __FREQ_PROC set THREADS 4 set STREAMS 1 set TIME 10 -set OPTS -u -i1 -P __STREAMS__ --pacing-timer 1000 +set OPTS -u -P __STREAMS__ --pacing-timer 1000 info Throughput in Gbps, latency in µs, __THREADS__ threads at __FREQ__ GHz, one stream each diff --git a/test/perf/pasta_tcp b/test/perf/pasta_tcp index 44c5e54..1847c83 100644 --- a/test/perf/pasta_tcp +++ b/test/perf/pasta_tcp @@ -25,7 +25,7 @@ set THREADS 2 set STREAMS 2 set TIME 10 hout OMIT echo __TIME__ / 6 | bc -l -set OPTS -Z -w 4M -l 1M -P __STREAMS__ -i1 -O__OMIT__ --pacing-timer 10000 +set OPTS -Z -w 4M -l 1M -P __STREAMS__ -O__OMIT__ --pacing-timer 10000 hout FREQ_PROCFS (echo "scale=1"; sed -n 's/cpu MHz.*: \([0-9]*\)\..*$/(\1+10^2\/2)\/10^3/p' /proc/cpuinfo) | bc -l | head -n1 hout FREQ_CPUFREQ (echo "scale=1"; printf '( %i + 10^5 / 2 ) / 10^6\n' $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq) ) | bc -l diff --git a/test/perf/pasta_udp b/test/perf/pasta_udp index abb88b0..27ea724 100644 --- a/test/perf/pasta_udp +++ b/test/perf/pasta_udp @@ -23,7 +23,7 @@ hout FREQ [ -n "__FREQ_CPUFREQ__" ] && echo __FREQ_CPUFREQ__ || echo __FREQ_PROC set THREADS 1 set STREAMS 4 set TIME 10 -set OPTS -u -i1 -P __STREAMS__ +set OPTS -u -P __STREAMS__ info Throughput in Gbps, latency in µs, one thread at __FREQ__ GHz, __STREAMS__ streams-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson