[PATCH 0/2] Adding SO_PEEK_OFF for TCPv6
From: Jon Maloy
From: Jon Maloy
On Sat, Aug 24, 2024 at 5:19 AM
From: Jon Maloy
When we added the SO_PEEK_OFF socket option to TCP we forgot to add it even for TCP on IPv6.
Even though you said "we forgot", I'm not sure if this patch series belongs to net-next material...
We do that here.
Fixes: 05ea491641d3 ("tcp: add support for SO_PEEK_OFF socket option") Reviewed-by: David Gibson
Reviewed-by: Stefano Brivio Tested-by: Stefano Brivio Signed-off-by: Jon Maloy
Reviewed-by: Jason Xing
On 2024-08-23 19:51, Jason Xing wrote:
On Sat, Aug 24, 2024 at 5:19 AM
wrote: From: Jon Maloy
When we added the SO_PEEK_OFF socket option to TCP we forgot to add it even for TCP on IPv6. Even though you said "we forgot", I'm not sure if this patch series belongs to net-next material... I agree regarding the fix. The selftest is new however, and belongs in net-next because of that. Since I made it one series I decided for net-next, but I leave it to the discretion of Jakub and the maintainer of the linux-kselftest system to decide where they go.
///jon
We do that here.
Fixes: 05ea491641d3 ("tcp: add support for SO_PEEK_OFF socket option") Reviewed-by: David Gibson
Reviewed-by: Stefano Brivio Tested-by: Stefano Brivio Signed-off-by: Jon Maloy Reviewed-by: Jason Xing You seem to forget to carry Eric's Reviewed-by tag, please see the link :) https://lore.kernel.org/all/CANn89iJmdGdAN1OZEfoM2LNVOewkYDuPXObRoNWhGyn93P=...
Thanks, Jason
On Fri, Aug 23, 2024 at 11:19 PM
From: Jon Maloy
When we added the SO_PEEK_OFF socket option to TCP we forgot to add it even for TCP on IPv6.
We do that here.
Fixes: 05ea491641d3 ("tcp: add support for SO_PEEK_OFF socket option") Reviewed-by: David Gibson
Reviewed-by: Stefano Brivio Tested-by: Stefano Brivio Signed-off-by: Jon Maloy
Reviewed-by: Eric Dumazet
From: Jon Maloy
Hello Jon,
On Sat, Aug 24, 2024 at 5:19 AM
From: Jon Maloy
We add a selftest to check that the new feature added in commit 05ea491641d3 ("tcp: add support for SO_PEEK_OFF socket option") works correctly.
Reviewed-by: Stefano Brivio
Tested-by: Stefano Brivio Signed-off-by: Jon Maloy
Thanks for working on this. Sorry that I just noticed I missed your previous reply :(
--- tools/testing/selftests/net/Makefile | 1 + tools/testing/selftests/net/tcp_so_peek_off.c | 181 ++++++++++++++++++ 2 files changed, 182 insertions(+) create mode 100644 tools/testing/selftests/net/tcp_so_peek_off.c
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index 8eaffd7a641c..1179e3261bef 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -80,6 +80,7 @@ TEST_PROGS += io_uring_zerocopy_tx.sh TEST_GEN_FILES += bind_bhash TEST_GEN_PROGS += sk_bind_sendto_listen TEST_GEN_PROGS += sk_connect_zero_addr +TEST_GEN_PROGS += tcp_so_peek_off TEST_PROGS += test_ingress_egress_chaining.sh TEST_GEN_PROGS += so_incoming_cpu TEST_PROGS += sctp_vrf.sh diff --git a/tools/testing/selftests/net/tcp_so_peek_off.c b/tools/testing/selftests/net/tcp_so_peek_off.c new file mode 100644 index 000000000000..8379ea02e3d7 --- /dev/null +++ b/tools/testing/selftests/net/tcp_so_peek_off.c @@ -0,0 +1,181 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include
+#include +#include +#include +#include +#include +#include +#include +#include "../kselftest.h" + +static char *afstr(int af) +{ + return af == AF_INET ? "TCP/IPv4" : "TCP/IPv6"; +} + +int tcp_peek_offset_probe(sa_family_t af) +{ + int optv = 0; + int ret = 0; + int s; + + s = socket(af, SOCK_STREAM | SOCK_CLOEXEC, IPPROTO_TCP); + if (s < 0) { + ksft_perror("Temporary TCP socket creation failed"); + } else { + if (!setsockopt(s, SOL_SOCKET, SO_PEEK_OFF, &optv, sizeof(int))) + ret = 1; + else + printf("%s does not support SO_PEEK_OFF\n", afstr(af)); + close(s); + } + return ret; +} + +static void tcp_peek_offset_set(int s, int offset) +{ + if (setsockopt(s, SOL_SOCKET, SO_PEEK_OFF, &offset, sizeof(offset))) + ksft_perror("Failed to set SO_PEEK_OFF value\n"); +} + +static int tcp_peek_offset_get(int s) +{ + int offset; + socklen_t len = sizeof(offset); + + if (getsockopt(s, SOL_SOCKET, SO_PEEK_OFF, &offset, &len)) + ksft_perror("Failed to get SO_PEEK_OFF value\n"); + return offset; +} + +static int tcp_peek_offset_test(sa_family_t af) +{ + union { + struct sockaddr sa; + struct sockaddr_in a4; + struct sockaddr_in6 a6; + } a; + int res = 0; + int s[2] = {0, 0}; + int recv_sock = 0; + int offset = 0; + ssize_t len; + char buf; + + memset(&a, 0, sizeof(a)); + a.sa.sa_family = af; + + s[0] = socket(af, SOCK_STREAM, IPPROTO_TCP); + s[1] = socket(af, SOCK_STREAM | SOCK_NONBLOCK, IPPROTO_TCP); + + if (s[0] < 0 || s[1] < 0) { + ksft_perror("Temporary probe socket creation failed\n"); + goto out;
Nit: I wonder if we can use more proper test statements to avoid such hiding failure[1] when closing a invalid file descriptor, even though it doesn't harm the test itself? [1]: "EBADF (Bad file descriptor)"
+ } + if (bind(s[0], &a.sa, sizeof(a)) < 0) { + ksft_perror("Temporary probe socket bind() failed\n"); + goto out; + } + if (getsockname(s[0], &a.sa, &((socklen_t) { sizeof(a) })) < 0) { + ksft_perror("Temporary probe socket getsockname() failed\n"); + goto out; + } + if (listen(s[0], 0) < 0) { + ksft_perror("Temporary probe socket listen() failed\n"); + goto out; + } + if (connect(s[1], &a.sa, sizeof(a)) >= 0 || errno != EINPROGRESS) { + ksft_perror("Temporary probe socket connect() failed\n"); + goto out; + } + recv_sock = accept(s[0], NULL, NULL); + if (recv_sock <= 0) { + ksft_perror("Temporary probe socket accept() failed\n"); + goto out;
Same here.
+ } + + /* Some basic tests of getting/setting offset */ + offset = tcp_peek_offset_get(recv_sock); + if (offset != -1) { + ksft_perror("Initial value of socket offset not -1\n"); + goto out; + } + tcp_peek_offset_set(recv_sock, 0); + offset = tcp_peek_offset_get(recv_sock); + if (offset != 0) { + ksft_perror("Failed to set socket offset to 0\n"); + goto out; + } + + /* Transfer a message */ + if (send(s[1], (char *)("ab"), 2, 0) <= 0 || errno != EINPROGRESS) { + ksft_perror("Temporary probe socket send() failed\n"); + goto out; + } + /* Read first byte */ + len = recv(recv_sock, &buf, 1, MSG_PEEK); + if (len != 1 || buf != 'a') { + ksft_perror("Failed to read first byte of message\n"); + goto out; + } + offset = tcp_peek_offset_get(recv_sock); + if (offset != 1) { + ksft_perror("Offset not forwarded correctly at first byte\n"); + goto out; + } + /* Try to read beyond last byte */ + len = recv(recv_sock, &buf, 2, MSG_PEEK); + if (len != 1 || buf != 'b') { + ksft_perror("Failed to read last byte of message\n"); + goto out; + } + offset = tcp_peek_offset_get(recv_sock); + if (offset != 2) { + ksft_perror("Offset not forwarded correctly at last byte\n"); + goto out; + } + /* Flush message */ + len = recv(recv_sock, NULL, 2, MSG_TRUNC); + if (len != 2) { + ksft_perror("Failed to flush message\n"); + goto out; + } + offset = tcp_peek_offset_get(recv_sock); + if (offset != 0) { + ksft_perror("Offset not reverted correctly after flush\n"); + goto out; + } + + printf("%s with MSG_PEEK_OFF works correctly\n", afstr(af)); + res = 1; +out: + close(recv_sock); + close(s[1]); + close(s[0]); + return res; +} + +int main(void) +{ + int res4, res6; + + res4 = tcp_peek_offset_probe(AF_INET); + res6 = tcp_peek_offset_probe(AF_INET6); + + if (!res4 && !res6) + return KSFT_SKIP; + + if (res4) + res4 = tcp_peek_offset_test(AF_INET); + + if (res6) + res6 = tcp_peek_offset_test(AF_INET6); + + if (!res4 || !res6)
What if res6 is NULL after checking tcp_peek_offset_probe() while res4 is always working correctly, then we will get notified with a KSFT_FAIL failure instead of KSFT_SKIP. The thing could happen because you reuse the same return value for v4/v6 mode. Thanks, Jason
+ return KSFT_FAIL; + + return KSFT_PASS; +} + -- 2.45.2
On 2024-08-23 19:44, Jason Xing wrote:
Hello Jon,
From: Jon Maloy
We add a selftest to check that the new feature added in commit 05ea491641d3 ("tcp: add support for SO_PEEK_OFF socket option") works correctly.
Reviewed-by: Stefano Brivio
Tested-by: Stefano Brivio Signed-off-by: Jon Maloy Thanks for working on this. Sorry that I just noticed I missed your On Sat, Aug 24, 2024 at 5:19 AM
wrote: previous reply :( There is still the ditto UDP selftest to be done ;-) --- tools/testing/selftests/net/Makefile | 1 + tools/testing/selftests/net/tcp_so_peek_off.c | 181 ++++++++++++++++++ 2 files changed, 182 insertions(+) create mode 100644 tools/testing/selftests/net/tcp_so_peek_off.c
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index 8eaffd7a641c..1179e3261bef 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -80,6 +80,7 @@ TEST_PROGS += io_uring_zerocopy_tx.sh TEST_GEN_FILES += bind_bhash TEST_GEN_PROGS += sk_bind_sendto_listen TEST_GEN_PROGS += sk_connect_zero_addr +TEST_GEN_PROGS += tcp_so_peek_off TEST_PROGS += test_ingress_egress_chaining.sh TEST_GEN_PROGS += so_incoming_cpu TEST_PROGS += sctp_vrf.sh diff --git a/tools/testing/selftests/net/tcp_so_peek_off.c b/tools/testing/selftests/net/tcp_so_peek_off.c new file mode 100644 index 000000000000..8379ea02e3d7 --- /dev/null +++ b/tools/testing/selftests/net/tcp_so_peek_off.c @@ -0,0 +1,181 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include
+#include +#include +#include +#include +#include +#include +#include +#include "../kselftest.h" + +static char *afstr(int af) +{ + return af == AF_INET ? "TCP/IPv4" : "TCP/IPv6"; +} + +int tcp_peek_offset_probe(sa_family_t af) +{ + int optv = 0; + int ret = 0; + int s; + + s = socket(af, SOCK_STREAM | SOCK_CLOEXEC, IPPROTO_TCP); + if (s < 0) { + ksft_perror("Temporary TCP socket creation failed"); + } else { + if (!setsockopt(s, SOL_SOCKET, SO_PEEK_OFF, &optv, sizeof(int))) + ret = 1; + else + printf("%s does not support SO_PEEK_OFF\n", afstr(af)); + close(s); + } + return ret; +} + +static void tcp_peek_offset_set(int s, int offset) +{ + if (setsockopt(s, SOL_SOCKET, SO_PEEK_OFF, &offset, sizeof(offset))) + ksft_perror("Failed to set SO_PEEK_OFF value\n"); +} + +static int tcp_peek_offset_get(int s) +{ + int offset; + socklen_t len = sizeof(offset); + + if (getsockopt(s, SOL_SOCKET, SO_PEEK_OFF, &offset, &len)) + ksft_perror("Failed to get SO_PEEK_OFF value\n"); + return offset; +} + +static int tcp_peek_offset_test(sa_family_t af) +{ + union { + struct sockaddr sa; + struct sockaddr_in a4; + struct sockaddr_in6 a6; + } a; + int res = 0; + int s[2] = {0, 0}; + int recv_sock = 0; + int offset = 0; + ssize_t len; + char buf; + + memset(&a, 0, sizeof(a)); + a.sa.sa_family = af; + + s[0] = socket(af, SOCK_STREAM, IPPROTO_TCP); + s[1] = socket(af, SOCK_STREAM | SOCK_NONBLOCK, IPPROTO_TCP); + + if (s[0] < 0 || s[1] < 0) { + ksft_perror("Temporary probe socket creation failed\n"); + goto out; Nit: I wonder if we can use more proper test statements to avoid such hiding failure[1] when closing a invalid file descriptor, even though it doesn't harm the test itself? [1]: "EBADF (Bad file descriptor)" Fixed that in v2.
+ } + if (bind(s[0], &a.sa, sizeof(a)) < 0) { + ksft_perror("Temporary probe socket bind() failed\n"); + goto out; + } + if (getsockname(s[0], &a.sa, &((socklen_t) { sizeof(a) })) < 0) { + ksft_perror("Temporary probe socket getsockname() failed\n"); + goto out; + } + if (listen(s[0], 0) < 0) { + ksft_perror("Temporary probe socket listen() failed\n"); + goto out; + } + if (connect(s[1], &a.sa, sizeof(a)) >= 0 || errno != EINPROGRESS) { + ksft_perror("Temporary probe socket connect() failed\n"); + goto out; + } + recv_sock = accept(s[0], NULL, NULL); + if (recv_sock <= 0) { + ksft_perror("Temporary probe socket accept() failed\n"); + goto out; Same here. Fixed. + } + + /* Some basic tests of getting/setting offset */ + offset = tcp_peek_offset_get(recv_sock); + if (offset != -1) { + ksft_perror("Initial value of socket offset not -1\n"); + goto out; + } + tcp_peek_offset_set(recv_sock, 0); + offset = tcp_peek_offset_get(recv_sock); + if (offset != 0) { + ksft_perror("Failed to set socket offset to 0\n"); + goto out; + } + + /* Transfer a message */ + if (send(s[1], (char *)("ab"), 2, 0) <= 0 || errno != EINPROGRESS) { + ksft_perror("Temporary probe socket send() failed\n"); + goto out; + } + /* Read first byte */ + len = recv(recv_sock, &buf, 1, MSG_PEEK); + if (len != 1 || buf != 'a') { + ksft_perror("Failed to read first byte of message\n"); + goto out; + } + offset = tcp_peek_offset_get(recv_sock); + if (offset != 1) { + ksft_perror("Offset not forwarded correctly at first byte\n"); + goto out; + } + /* Try to read beyond last byte */ + len = recv(recv_sock, &buf, 2, MSG_PEEK); + if (len != 1 || buf != 'b') { + ksft_perror("Failed to read last byte of message\n"); + goto out; + } + offset = tcp_peek_offset_get(recv_sock); + if (offset != 2) { + ksft_perror("Offset not forwarded correctly at last byte\n"); + goto out; + } + /* Flush message */ + len = recv(recv_sock, NULL, 2, MSG_TRUNC); + if (len != 2) { + ksft_perror("Failed to flush message\n"); + goto out; + } + offset = tcp_peek_offset_get(recv_sock); + if (offset != 0) { + ksft_perror("Offset not reverted correctly after flush\n"); + goto out; + } + + printf("%s with MSG_PEEK_OFF works correctly\n", afstr(af)); + res = 1; +out: + close(recv_sock); + close(s[1]); + close(s[0]); + return res; +} + +int main(void) +{ + int res4, res6; + + res4 = tcp_peek_offset_probe(AF_INET); + res6 = tcp_peek_offset_probe(AF_INET6); + + if (!res4 && !res6) + return KSFT_SKIP; + + if (res4) + res4 = tcp_peek_offset_test(AF_INET); + + if (res6) + res6 = tcp_peek_offset_test(AF_INET6); + + if (!res4 || !res6) What if res6 is NULL after checking tcp_peek_offset_probe() while res4 is always working correctly, then we will get notified with a KSFT_FAIL failure instead of KSFT_SKIP. This is intentional. If IPv4 is supported, and IPv6 is not, that is a failure.
Regards ///jon
The thing could happen because you reuse the same return value for v4/v6 mode.
Thanks, Jason
+ return KSFT_FAIL; + + return KSFT_PASS; +} + -- 2.45.2
Hello Jon,
On Tue, Aug 27, 2024 at 3:58 AM Jon Maloy
On 2024-08-23 19:44, Jason Xing wrote:
Hello Jon,
From: Jon Maloy
We add a selftest to check that the new feature added in commit 05ea491641d3 ("tcp: add support for SO_PEEK_OFF socket option") works correctly.
Reviewed-by: Stefano Brivio
Tested-by: Stefano Brivio Signed-off-by: Jon Maloy Thanks for working on this. Sorry that I just noticed I missed your On Sat, Aug 24, 2024 at 5:19 AM
wrote: previous reply :( There is still the ditto UDP selftest to be done ;-)
The reason why I didn't respond at that time is because I was unsure if I had enough time to finish it. Now it's time. After digging into this, there will be a lot of duplicated code if I write a new one named like "udp_so_peek_off". I think adjusting your tcp_so_peek_off.c to complete the UDP part is just fine. Of course, tcp_so_peek_off.c will be renamed :) I will post one later to see if it's reasonable... Thanks, Jason
participants (4)
-
Eric Dumazet
-
Jason Xing
-
jmaloy@redhat.com
-
Jon Maloy