Patches 1/4 to 3/4 make debugging of the issue solved in 4/4 more viable. Stefano Brivio (4): conf, passt.1: Don't imply --foreground with --debug tap: Trace received (outbound) ICMP packets in debug mode, too icmp: Add debugging messages for handled replies and requests icmp: Don't discard first reply sequence for a given echo ID conf.c | 7 +++---- icmp.c | 46 +++++++++++++++++++++++++++++++++++++++------- icmp.h | 1 + passt.1 | 5 ++--- passt.c | 3 +++ tap.c | 2 ++ 6 files changed, 50 insertions(+), 14 deletions(-) -- 2.35.1
Having -f implied by -d (and --trace) usually saves some typing, but debug mode in background (with a log file) is quite useful if pasta is started by Podman, and is probably going to be handy for passt with libvirt later, too. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- conf.c | 7 +++---- passt.1 | 5 ++--- 2 files changed, 5 insertions(+), 7 deletions(-) diff --git a/conf.c b/conf.c index 598c711..90214f5 100644 --- a/conf.c +++ b/conf.c @@ -636,7 +636,7 @@ static void usage(const char *name) info(""); - info( " -d, --debug Be verbose, don't run in background"); + info( " -d, --debug Be verbose"); info( " --trace Be extra verbose, implies --debug"); info( " -q, --quiet Don't print informational messages"); info( " -f, --foreground Don't run in background"); @@ -1192,7 +1192,7 @@ void conf(struct ctx *c, int argc, char **argv) usage(argv[0]); } - c->trace = c->debug = c->foreground = 1; + c->trace = c->debug = 1; break; case 12: if (runas) { @@ -1233,7 +1233,6 @@ void conf(struct ctx *c, int argc, char **argv) } c->debug = 1; - c->foreground = 1; break; case 'e': if (logfile) { @@ -1275,7 +1274,7 @@ void conf(struct ctx *c, int argc, char **argv) c->quiet = 1; break; case 'f': - if (c->foreground && !c->debug) { + if (c->foreground) { err("Multiple --foreground options given"); usage(argv[0]); } diff --git a/passt.1 b/passt.1 index 92c4ce2..d121050 100644 --- a/passt.1 +++ b/passt.1 @@ -79,12 +79,11 @@ for performance reasons. .TP .BR \-d ", " \-\-debug -Be verbose, don't run in background, don't log to the system logger. +Be verbose, don't log to the system logger. .TP .BR \-\-trace -Be extra verbose, show single packets, don't run in background. Implies -\fB--debug\fR. +Be extra verbose, show single packets. Implies \fB--debug\fR. .TP .BR \-q ", " \-\-quiet -- 2.35.1
On Wed, Oct 26, 2022 at 06:25:28PM +0200, Stefano Brivio wrote:Having -f implied by -d (and --trace) usually saves some typing, but debug mode in background (with a log file) is quite useful if pasta is started by Podman, and is probably going to be handy for passt with libvirt later, too. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>--- conf.c | 7 +++---- passt.1 | 5 ++--- 2 files changed, 5 insertions(+), 7 deletions(-) diff --git a/conf.c b/conf.c index 598c711..90214f5 100644 --- a/conf.c +++ b/conf.c @@ -636,7 +636,7 @@ static void usage(const char *name) info(""); - info( " -d, --debug Be verbose, don't run in background"); + info( " -d, --debug Be verbose"); info( " --trace Be extra verbose, implies --debug"); info( " -q, --quiet Don't print informational messages"); info( " -f, --foreground Don't run in background"); @@ -1192,7 +1192,7 @@ void conf(struct ctx *c, int argc, char **argv) usage(argv[0]); } - c->trace = c->debug = c->foreground = 1; + c->trace = c->debug = 1; break; case 12: if (runas) { @@ -1233,7 +1233,6 @@ void conf(struct ctx *c, int argc, char **argv) } c->debug = 1; - c->foreground = 1; break; case 'e': if (logfile) { @@ -1275,7 +1274,7 @@ void conf(struct ctx *c, int argc, char **argv) c->quiet = 1; break; case 'f': - if (c->foreground && !c->debug) { + if (c->foreground) { err("Multiple --foreground options given"); usage(argv[0]); } diff --git a/passt.1 b/passt.1 index 92c4ce2..d121050 100644 --- a/passt.1 +++ b/passt.1 @@ -79,12 +79,11 @@ for performance reasons. .TP .BR \-d ", " \-\-debug -Be verbose, don't run in background, don't log to the system logger. +Be verbose, don't log to the system logger. .TP .BR \-\-trace -Be extra verbose, show single packets, don't run in background. Implies -\fB--debug\fR. +Be extra verbose, show single packets. Implies \fB--debug\fR. .TP .BR \-q ", " \-\-quiet-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
This only worked for ICMPv6: ICMP packets have no TCP-style header, so they are handled as a special case before packet sequences are formed, and the call to tap_packet_debug() was missing. Fixes: bb708111833e ("treewide: Packet abstraction with mandatory boundary checks") Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- tap.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tap.c b/tap.c index 3f78c99..4dcff4f 100644 --- a/tap.c +++ b/tap.c @@ -463,6 +463,8 @@ resume: if (c->no_icmp) continue; + tap_packet_debug(iph, NULL, NULL, 0, NULL, 1); + packet_add(pkt, l4_len, l4h); icmp_tap_handler(c, AF_INET, &iph->daddr, pkt, now); continue; -- 2.35.1
On Wed, Oct 26, 2022 at 06:25:29PM +0200, Stefano Brivio wrote:This only worked for ICMPv6: ICMP packets have no TCP-style header, so they are handled as a special case before packet sequences are formed, and the call to tap_packet_debug() was missing. Fixes: bb708111833e ("treewide: Packet abstraction with mandatory boundary checks") Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>--- tap.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tap.c b/tap.c index 3f78c99..4dcff4f 100644 --- a/tap.c +++ b/tap.c @@ -463,6 +463,8 @@ resume: if (c->no_icmp) continue; + tap_packet_debug(iph, NULL, NULL, 0, NULL, 1); + packet_add(pkt, l4_len, l4h); icmp_tap_handler(c, AF_INET, &iph->daddr, pkt, now); continue;-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
...instead of just reporting errors. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- icmp.c | 30 +++++++++++++++++++++++++----- 1 file changed, 25 insertions(+), 5 deletions(-) diff --git a/icmp.c b/icmp.c index 233acf9..9caa7e6 100644 --- a/icmp.c +++ b/icmp.c @@ -88,6 +88,7 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, struct icmp6hdr *ih = (struct icmp6hdr *)buf; id = ntohs(ih->icmp6_identifier); + seq = ntohs(ih->icmp6_sequence); /* If bind() fails e.g. because of a broken SELinux policy, this * might happen. Fix up the identifier to match the sent one. @@ -97,14 +98,15 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, /* In PASTA mode, we'll get any reply we send, discard them. */ if (c->mode == MODE_PASTA) { - seq = ntohs(ih->icmp6_sequence); - if (icmp_id_map[V6][id].seq == seq) return; icmp_id_map[V6][id].seq = seq; } + debug("ICMPv6: echo %s to tap, ID: %i, seq: %i", + (ih->icmp6_type == 128) ? "request" : "reply", id, seq); + tap_icmp6_send(c, &sr6->sin6_addr, tap_ip6_daddr(c, &sr6->sin6_addr), buf, n); } else { @@ -112,11 +114,12 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, struct icmphdr *ih = (struct icmphdr *)buf; id = ntohs(ih->un.echo.id); + seq = ntohs(ih->un.echo.sequence); + if (id != iref->icmp.id) ih->un.echo.id = htons(iref->icmp.id); if (c->mode == MODE_PASTA) { - seq = ntohs(ih->un.echo.sequence); if (icmp_id_map[V4][id].seq == seq) return; @@ -124,6 +127,9 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V4][id].seq = seq; } + debug("ICMP: echo %s to tap, ID: %i, seq: %i", + (ih->type == ICMP_ECHO) ? "request" : "reply", id, seq); + tap_icmp4_send(c, sr4->sin_addr.s_addr, tap_ip4_daddr(c), buf, n); } @@ -175,14 +181,21 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr, } icmp_id_map[V4][id].sock = s; + + debug("ICMP: new socket %i for echo ID %i", s, id); } icmp_id_map[V4][id].ts = now->tv_sec; bitmap_set(icmp_act[V4], id); sa.sin_addr = *(struct in_addr *)addr; if (sendto(s, ih, sizeof(*ih) + plen, MSG_NOSIGNAL, - (struct sockaddr *)&sa, sizeof(sa)) < 0) + (struct sockaddr *)&sa, sizeof(sa)) < 0) { debug("ICMP: failed to relay request to socket"); + } else { + debug("ICMP: echo %s to socket, ID: %i, seq: %i", + (ih->type == ICMP_ECHO) ? "request" : "reply", + id, ntohs(ih->un.echo.sequence)); + } } else if (af == AF_INET6) { union icmp_epoll_ref iref = { .icmp.v6 = 1 }; struct sockaddr_in6 sa = { @@ -214,14 +227,21 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr, } icmp_id_map[V6][id].sock = s; + + debug("ICMPv6: new socket %i for echo ID %i", s, id); } icmp_id_map[V6][id].ts = now->tv_sec; bitmap_set(icmp_act[V6], id); sa.sin6_addr = *(struct in6_addr *)addr; if (sendto(s, ih, sizeof(*ih) + plen, MSG_NOSIGNAL, - (struct sockaddr *)&sa, sizeof(sa)) < 1) + (struct sockaddr *)&sa, sizeof(sa)) < 1) { debug("ICMPv6: failed to relay request to socket"); + } else { + debug("ICMPv6: echo %s to socket, ID: %i, seq: %i", + (ih->icmp6_type == 128) ? "request" : "reply", + id, ntohs(ih->icmp6_sequence)); + } } return 1; -- 2.35.1
On Wed, Oct 26, 2022 at 06:25:30PM +0200, Stefano Brivio wrote:...instead of just reporting errors. Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>--- icmp.c | 30 +++++++++++++++++++++++++----- 1 file changed, 25 insertions(+), 5 deletions(-) diff --git a/icmp.c b/icmp.c index 233acf9..9caa7e6 100644 --- a/icmp.c +++ b/icmp.c @@ -88,6 +88,7 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, struct icmp6hdr *ih = (struct icmp6hdr *)buf; id = ntohs(ih->icmp6_identifier); + seq = ntohs(ih->icmp6_sequence); /* If bind() fails e.g. because of a broken SELinux policy, this * might happen. Fix up the identifier to match the sent one. @@ -97,14 +98,15 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, /* In PASTA mode, we'll get any reply we send, discard them. */ if (c->mode == MODE_PASTA) { - seq = ntohs(ih->icmp6_sequence); - if (icmp_id_map[V6][id].seq == seq) return; icmp_id_map[V6][id].seq = seq; } + debug("ICMPv6: echo %s to tap, ID: %i, seq: %i", + (ih->icmp6_type == 128) ? "request" : "reply", id, seq); + tap_icmp6_send(c, &sr6->sin6_addr, tap_ip6_daddr(c, &sr6->sin6_addr), buf, n); } else { @@ -112,11 +114,12 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, struct icmphdr *ih = (struct icmphdr *)buf; id = ntohs(ih->un.echo.id); + seq = ntohs(ih->un.echo.sequence); + if (id != iref->icmp.id) ih->un.echo.id = htons(iref->icmp.id); if (c->mode == MODE_PASTA) { - seq = ntohs(ih->un.echo.sequence); if (icmp_id_map[V4][id].seq == seq) return; @@ -124,6 +127,9 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, icmp_id_map[V4][id].seq = seq; } + debug("ICMP: echo %s to tap, ID: %i, seq: %i", + (ih->type == ICMP_ECHO) ? "request" : "reply", id, seq); + tap_icmp4_send(c, sr4->sin_addr.s_addr, tap_ip4_daddr(c), buf, n); } @@ -175,14 +181,21 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr, } icmp_id_map[V4][id].sock = s; + + debug("ICMP: new socket %i for echo ID %i", s, id); } icmp_id_map[V4][id].ts = now->tv_sec; bitmap_set(icmp_act[V4], id); sa.sin_addr = *(struct in_addr *)addr; if (sendto(s, ih, sizeof(*ih) + plen, MSG_NOSIGNAL, - (struct sockaddr *)&sa, sizeof(sa)) < 0) + (struct sockaddr *)&sa, sizeof(sa)) < 0) { debug("ICMP: failed to relay request to socket"); + } else { + debug("ICMP: echo %s to socket, ID: %i, seq: %i", + (ih->type == ICMP_ECHO) ? "request" : "reply", + id, ntohs(ih->un.echo.sequence)); + } } else if (af == AF_INET6) { union icmp_epoll_ref iref = { .icmp.v6 = 1 }; struct sockaddr_in6 sa = { @@ -214,14 +227,21 @@ int icmp_tap_handler(const struct ctx *c, int af, const void *addr, } icmp_id_map[V6][id].sock = s; + + debug("ICMPv6: new socket %i for echo ID %i", s, id); } icmp_id_map[V6][id].ts = now->tv_sec; bitmap_set(icmp_act[V6], id); sa.sin6_addr = *(struct in6_addr *)addr; if (sendto(s, ih, sizeof(*ih) + plen, MSG_NOSIGNAL, - (struct sockaddr *)&sa, sizeof(sa)) < 1) + (struct sockaddr *)&sa, sizeof(sa)) < 1) { debug("ICMPv6: failed to relay request to socket"); + } else { + debug("ICMPv6: echo %s to socket, ID: %i, seq: %i", + (ih->icmp6_type == 128) ? "request" : "reply", + id, ntohs(ih->icmp6_sequence)); + } } return 1;-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
In pasta mode, ICMP and ICMPv6 echo sockets relay back to us any reply we send: we're on the same host as the target, after all. We discard them by comparing the last sequence we sent with the sequence we receive. However, on the first reply for a given identifier, the sequence might be zero, depending on the implementation of ping(8): we need another value to indicate we haven't sent any sequence number, yet. Use -1 as initialiser in the echo identifier map. This is visible with Busybox's ping, and was reported by Paul on the integration at https://github.com/containers/podman/pull/16141, with: $ podman run --net=pasta alpine ping -c 2 192.168.188.1 ...where only the second reply would be routed back. Reported-by: Paul Holzinger <pholzing(a)redhat.com> Fixes: 33482d5bf293 ("passt: Add PASTA mode, major rework") Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com> --- icmp.c | 16 ++++++++++++++-- icmp.h | 1 + passt.c | 3 +++ 3 files changed, 18 insertions(+), 2 deletions(-) diff --git a/icmp.c b/icmp.c index 9caa7e6..4ee847f 100644 --- a/icmp.c +++ b/icmp.c @@ -44,12 +44,12 @@ /** * struct icmp_id_sock - Tracking information for single ICMP echo identifier * @sock: Bound socket for identifier - * @seq: Last sequence number sent to tap, host order + * @seq: Last sequence number sent to tap, host order, -1: not sent yet * @ts: Last associated activity from tap, seconds */ struct icmp_id_sock { int sock; - uint16_t seq; + int seq; time_t ts; }; @@ -273,6 +273,7 @@ static void icmp_timer_one(const struct ctx *c, int v6, uint16_t id, epoll_ctl(c->epollfd, EPOLL_CTL_DEL, id_map->sock, NULL); close(id_map->sock); id_map->sock = 0; + id_map->seq = -1; } /** @@ -301,3 +302,14 @@ v6: goto v6; } } + +/** + * icmp_init() - Initialise sequences in ID map to -1 (no sequence sent yet) + */ +void icmp_init(void) +{ + unsigned i; + + for (i = 0; i < ICMP_NUM_IDS; i++) + icmp_id_map[V4][i].seq = icmp_id_map[V6][i].seq = -1; +} diff --git a/icmp.h b/icmp.h index 458ce31..275486d 100644 --- a/icmp.h +++ b/icmp.h @@ -15,6 +15,7 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, int icmp_tap_handler(const struct ctx *c, int af, const void *addr, const struct pool *p, const struct timespec *now); void icmp_timer(const struct ctx *c, const struct timespec *ts); +void icmp_init(void); /** * union icmp_epoll_ref - epoll reference portion for ICMP tracking diff --git a/passt.c b/passt.c index ff4ee5d..34cd832 100644 --- a/passt.c +++ b/passt.c @@ -256,6 +256,9 @@ int main(int argc, char **argv) if ((!c.no_udp && udp_init(&c)) || (!c.no_tcp && tcp_init(&c))) exit(EXIT_FAILURE); + if (!c.no_icmp) + icmp_init(); + proto_update_l2_buf(c.mac_guest, c.mac, &c.ip4.addr); if (c.ifi4 && !c.no_dhcp) -- 2.35.1
On Wed, Oct 26, 2022 at 06:25:31PM +0200, Stefano Brivio wrote:In pasta mode, ICMP and ICMPv6 echo sockets relay back to us any reply we send: we're on the same host as the target, after all. We discard them by comparing the last sequence we sent with the sequence we receive. However, on the first reply for a given identifier, the sequence might be zero, depending on the implementation of ping(8): we need another value to indicate we haven't sent any sequence number, yet. Use -1 as initialiser in the echo identifier map. This is visible with Busybox's ping, and was reported by Paul on the integration at https://github.com/containers/podman/pull/16141, with: $ podman run --net=pasta alpine ping -c 2 192.168.188.1 ...where only the second reply would be routed back. Reported-by: Paul Holzinger <pholzing(a)redhat.com> Fixes: 33482d5bf293 ("passt: Add PASTA mode, major rework") Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>Reviewed-by: David Gibson <david(a)gibson.dropbear.id.au>--- icmp.c | 16 ++++++++++++++-- icmp.h | 1 + passt.c | 3 +++ 3 files changed, 18 insertions(+), 2 deletions(-) diff --git a/icmp.c b/icmp.c index 9caa7e6..4ee847f 100644 --- a/icmp.c +++ b/icmp.c @@ -44,12 +44,12 @@ /** * struct icmp_id_sock - Tracking information for single ICMP echo identifier * @sock: Bound socket for identifier - * @seq: Last sequence number sent to tap, host order + * @seq: Last sequence number sent to tap, host order, -1: not sent yet * @ts: Last associated activity from tap, seconds */ struct icmp_id_sock { int sock; - uint16_t seq; + int seq; time_t ts; }; @@ -273,6 +273,7 @@ static void icmp_timer_one(const struct ctx *c, int v6, uint16_t id, epoll_ctl(c->epollfd, EPOLL_CTL_DEL, id_map->sock, NULL); close(id_map->sock); id_map->sock = 0; + id_map->seq = -1; } /** @@ -301,3 +302,14 @@ v6: goto v6; } } + +/** + * icmp_init() - Initialise sequences in ID map to -1 (no sequence sent yet) + */ +void icmp_init(void) +{ + unsigned i; + + for (i = 0; i < ICMP_NUM_IDS; i++) + icmp_id_map[V4][i].seq = icmp_id_map[V6][i].seq = -1; +} diff --git a/icmp.h b/icmp.h index 458ce31..275486d 100644 --- a/icmp.h +++ b/icmp.h @@ -15,6 +15,7 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, int icmp_tap_handler(const struct ctx *c, int af, const void *addr, const struct pool *p, const struct timespec *now); void icmp_timer(const struct ctx *c, const struct timespec *ts); +void icmp_init(void); /** * union icmp_epoll_ref - epoll reference portion for ICMP tracking diff --git a/passt.c b/passt.c index ff4ee5d..34cd832 100644 --- a/passt.c +++ b/passt.c @@ -256,6 +256,9 @@ int main(int argc, char **argv) if ((!c.no_udp && udp_init(&c)) || (!c.no_tcp && tcp_init(&c))) exit(EXIT_FAILURE); + if (!c.no_icmp) + icmp_init(); + proto_update_l2_buf(c.mac_guest, c.mac, &c.ip4.addr); if (c.ifi4 && !c.no_dhcp)-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson