Here's my latest revision of some of the basics of the flow table. So far it's basically just a renaming of the existing TCP connection table. It's used for some new logging helpers, but otherwise doesn't really function any differently. However, this subset of the flow table work no longer bloats flow/connection entries over a single cache line. That removes the most prominent drawback of earlier revisions, meaning I think this series is ready for merge now. Doing so will mean the later series making more substantive changes to the flow behaviour are simpler. David Gibson (6): test: Make handling of shell prompts with escapes a little more reliable flow, tcp: Generalise connection types flow, tcp: Move TCP connection table to unified flow table flow, tcp: Consolidate flow pointer<->index helpers flow: Make unified version of flow table compaction flow, tcp: Add logging helpers for connection related messages Makefile | 14 +-- flow.c | 87 +++++++++++++++++++ flow.h | 60 +++++++++++++ flow_table.h | 45 ++++++++++ passt.h | 3 + tcp.c | 232 +++++++++++++++++++++++--------------------------- tcp.h | 5 -- tcp_conn.h | 46 +++------- tcp_splice.c | 88 +++++++++---------- test/lib/term | 6 +- 10 files changed, 361 insertions(+), 225 deletions(-) create mode 100644 flow.c create mode 100644 flow.h create mode 100644 flow_table.h -- 2.42.0
When using the old-style "pane" methods of executing commands during the tests, we need to scan the shell output for prompts in order to tell when commands have finished. This is inherently unreliable because commands could output things that look like prompts, and prompts might not look like we expect them to. The only way to really fix this is to use a better way of dispatching commands, like the newer "context" system. However, it's awkward to convert everything to "context" right at the moment, so we're still relying on some tests that do work most of the time. It is, however, particularly sensitive to fancy coloured prompts using escape sequences. Currently we try to handle this by stripping actual ESC characters with tr, then looking for some common variants. We can do a bit better: instead strip all escape sequences using sed before looking for our prompt. Or, at least, any one using [a-zA-Z] as the terminating character. Strictly speaking ANSI escapes can be terminated by any character in 0x40..0x7e, which isn't easily expressed in a regexp. This should capture all common ones, though. With this transformation we can simplify the list of patterns we then look for as a prompt, removing some redundant variants. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- test/lib/term | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/test/lib/term b/test/lib/term index aa05bf1..262937e 100755 --- a/test/lib/term +++ b/test/lib/term @@ -203,11 +203,9 @@ pane_wait() { __done=0 while - __l="$(tail -1 ${LOGDIR}/pane_${__lc}.log | tr -d [:cntrl:])" + __l="$(tail -1 ${LOGDIR}/pane_${__lc}.log | sed 's/[[][^a-zA-Z]*[a-zA-Z]//g')" case ${__l} in - '$ ' | '# ' | '# # ' | *"$ " | *"# ") return ;; - *" #[m " | *" #[m [K" | *"]# ["*) return ;; - *' $ [6n' | *' # [6n' ) return ;; + *"$ " | *"# ") return ;; esac do sleep 0.1 || sleep 1; done } -- 2.42.0
Currently TCP connections use a 1-bit selector, 'spliced', to determine the rest of the contents of the structure. We want to generalise the TCP connection table to other types of flows in other protocols. Make a start on this by replacing the tcp_conn_common structure with a new flow_common structure with an enum rather than a simple boolean indicating the type of flow. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- Makefile | 8 +++---- flow.c | 16 +++++++++++++ flow.h | 36 ++++++++++++++++++++++++++++++ tcp.c | 63 +++++++++++++++++++++++++++++++++++++--------------- tcp_conn.h | 24 ++++++-------------- tcp_splice.c | 3 ++- 6 files changed, 110 insertions(+), 40 deletions(-) create mode 100644 flow.c create mode 100644 flow.h diff --git a/Makefile b/Makefile index 6c53695..e2970e3 100644 --- a/Makefile +++ b/Makefile @@ -44,15 +44,15 @@ FLAGS += -DARCH=\"$(TARGET_ARCH)\" FLAGS += -DVERSION=\"$(VERSION)\" FLAGS += -DDUAL_STACK_SOCKETS=$(DUAL_STACK_SOCKETS) -PASST_SRCS = arch.c arp.c checksum.c conf.c dhcp.c dhcpv6.c icmp.c igmp.c \ - isolation.c lineread.c log.c mld.c ndp.c netlink.c packet.c passt.c \ - pasta.c pcap.c port_fwd.c tap.c tcp.c tcp_splice.c udp.c util.c +PASST_SRCS = arch.c arp.c checksum.c conf.c dhcp.c dhcpv6.c flow.c icmp.c \ + igmp.c isolation.c lineread.c log.c mld.c ndp.c netlink.c packet.c \ + passt.c pasta.c pcap.c port_fwd.c tap.c tcp.c tcp_splice.c udp.c util.c QRAP_SRCS = qrap.c SRCS = $(PASST_SRCS) $(QRAP_SRCS) MANPAGES = passt.1 pasta.1 qrap.1 -PASST_HEADERS = arch.h arp.h checksum.h conf.h dhcp.h dhcpv6.h icmp.h \ +PASST_HEADERS = arch.h arp.h checksum.h conf.h dhcp.h dhcpv6.h flow.h icmp.h \ inany.h isolation.h lineread.h log.h ndp.h netlink.h packet.h passt.h \ pasta.h pcap.h pif.h port_fwd.h siphash.h tap.h tcp.h tcp_conn.h \ tcp_splice.h udp.h util.h diff --git a/flow.c b/flow.c new file mode 100644 index 0000000..7c3603c --- /dev/null +++ b/flow.c @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later + * Copyright Red Hat + * Author: David Gibson <david(a)gibson.dropbear.id.au> + * + * Tracking for logical "flows" of packets. + */ + +#include <stdint.h> + +#include "flow.h" + +const char *flow_type_str[] = { + [FLOW_TYPE_NONE] = "<none>", + [FLOW_TCP] = "TCP connection", + [FLOW_TCP_SPLICE] = "TCP connection (spliced)", +}; diff --git a/flow.h b/flow.h new file mode 100644 index 0000000..88e6f0b --- /dev/null +++ b/flow.h @@ -0,0 +1,36 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later + * Copyright Red Hat + * Author: David Gibson <david(a)gibson.dropbear.id.au> + * + * Tracking for logical "flows" of packets. + */ +#ifndef FLOW_H +#define FLOW_H + +/** + * enum flow_type - Different types of packet flows we track + */ +enum flow_type { + /* Represents an invalid or unused flow */ + FLOW_TYPE_NONE = 0, + /* A TCP connection between a socket and tap interface */ + FLOW_TCP, + /* A TCP connection between a host socket and ns socket */ + FLOW_TCP_SPLICE, + + FLOW_TYPE_MAX = FLOW_TCP_SPLICE, +}; + +extern const char *flow_type_str[]; +#define FLOW_TYPE(f) \ + ((f)->type <= FLOW_TYPE_MAX ? flow_type_str[(f)->type] : "?") + +/** + * struct flow_common - Common fields for packet flows + * @type: Type of packet flow + */ +struct flow_common { + uint8_t type; +}; + +#endif /* FLOW_H */ diff --git a/tcp.c b/tcp.c index 40e3dec..1c25032 100644 --- a/tcp.c +++ b/tcp.c @@ -299,6 +299,7 @@ #include "tcp_splice.h" #include "log.h" #include "inany.h" +#include "flow.h" #include "tcp_conn.h" @@ -584,7 +585,7 @@ static inline struct tcp_tap_conn *conn_at_idx(int idx) { if ((idx < 0) || (idx >= TCP_MAX_CONNS)) return NULL; - ASSERT(!(CONN(idx)->c.spliced)); + ASSERT(CONN(idx)->f.type == FLOW_TCP); return CONN(idx); } @@ -1319,14 +1320,21 @@ void tcp_table_compact(struct ctx *c, union tcp_conn *hole) from = tc + c->tcp.conn_count; memcpy(hole, from, sizeof(*hole)); - if (from->c.spliced) - tcp_splice_conn_update(c, &hole->splice); - else + switch (from->f.type) { + case FLOW_TCP: tcp_tap_conn_update(c, &from->tap, &hole->tap); + break; + case FLOW_TCP_SPLICE: + tcp_splice_conn_update(c, &hole->splice); + break; + default: + die("Unexpected %s in tcp_table_compact()", + FLOW_TYPE(&from->f)); + } - debug("TCP: table compaction (spliced=%d): old index %li, new index %li, " + debug("TCP: table compaction (%s): old index %li, new index %li, " "from: %p, to: %p", - from->c.spliced, CONN_IDX(from), CONN_IDX(hole), + FLOW_TYPE(&from->f), CONN_IDX(from), CONN_IDX(hole), (void *)from, (void *)hole); memset(from, 0, sizeof(*from)); @@ -1402,12 +1410,18 @@ void tcp_defer_handler(struct ctx *c) tcp_l2_data_buf_flush(c); for (conn = tc + c->tcp.conn_count - 1; conn >= tc; conn--) { - if (conn->c.spliced) { - if (conn->splice.flags & CLOSING) - tcp_splice_destroy(c, conn); - } else { + switch (conn->f.type) { + case FLOW_TCP: if (conn->tap.events == CLOSED) tcp_conn_destroy(c, conn); + break; + case FLOW_TCP_SPLICE: + if (conn->splice.flags & CLOSING) + tcp_splice_destroy(c, conn); + break; + default: + die("Unexpected %s in tcp_defer_handler()", + FLOW_TYPE(&conn->f)); } } } @@ -2017,7 +2031,7 @@ static void tcp_conn_from_tap(struct ctx *c, } conn = CONN(c->tcp.conn_count++); - conn->c.spliced = false; + conn->f.type = FLOW_TCP; conn->sock = s; conn->timer = -1; conn_event(c, conn, TAP_SYN_RCVD); @@ -2726,7 +2740,7 @@ static void tcp_tap_conn_from_sock(struct ctx *c, const struct sockaddr *sa, const struct timespec *now) { - conn->c.spliced = false; + conn->f.type = FLOW_TCP; conn->sock = s; conn->timer = -1; conn->ws_to_tap = conn->ws_from_tap = 0; @@ -2909,10 +2923,17 @@ void tcp_sock_handler(struct ctx *c, union epoll_ref ref, uint32_t events) { union tcp_conn *conn = tc + ref.tcp.index; - if (conn->c.spliced) - tcp_splice_sock_handler(c, &conn->splice, ref.fd, events); - else + switch (conn->f.type) { + case FLOW_TCP: tcp_tap_sock_handler(c, &conn->tap, events); + break; + case FLOW_TCP_SPLICE: + tcp_splice_sock_handler(c, &conn->splice, ref.fd, events); + break; + default: + die("Unexpected %s in tcp_sock_handler_compact()", + FLOW_TYPE(&conn->f)); + } } /** @@ -3244,11 +3265,17 @@ void tcp_timer(struct ctx *c, const struct timespec *ts) } for (conn = tc + c->tcp.conn_count - 1; conn >= tc; conn--) { - if (conn->c.spliced) { - tcp_splice_timer(c, conn); - } else { + switch (conn->f.type) { + case FLOW_TCP: if (conn->tap.events == CLOSED) tcp_conn_destroy(c, conn); + break; + case FLOW_TCP_SPLICE: + tcp_splice_timer(c, conn); + break; + default: + die("Unexpected %s in tcp_timer()", + FLOW_TYPE(&conn->f)); } } diff --git a/tcp_conn.h b/tcp_conn.h index 0c6a35b..136f8da 100644 --- a/tcp_conn.h +++ b/tcp_conn.h @@ -9,19 +9,9 @@ #ifndef TCP_CONN_H #define TCP_CONN_H -/** - * struct tcp_conn_common - Common fields for spliced and non-spliced - * @spliced: Is this a spliced connection? - */ -struct tcp_conn_common { - bool spliced :1; -}; - -extern const char *tcp_common_flag_str[]; - /** * struct tcp_tap_conn - Descriptor for a TCP connection (not spliced) - * @c: Fields common with tcp_splice_conn + * @f: Generic flow information * @in_epoll: Is the connection in the epoll set? * @next_index: Connection index of next item in hash chain, -1 for none * @tap_mss: MSS advertised by tap/guest, rounded to 2 ^ TCP_MSS_BITS @@ -46,8 +36,8 @@ extern const char *tcp_common_flag_str[]; * @seq_init_from_tap: Initial sequence number from tap */ struct tcp_tap_conn { - /* Must be first element to match tcp_splice_conn */ - struct tcp_conn_common c; + /* Must be first element */ + struct flow_common f; bool in_epoll :1; int next_index :TCP_CONN_INDEX_BITS + 2; @@ -121,7 +111,7 @@ struct tcp_tap_conn { #define SIDES 2 /** * struct tcp_splice_conn - Descriptor for a spliced TCP connection - * @c: Fields common with tcp_tap_conn + * @f: Generic flow information * @in_epoll: Is the connection in the epoll set? * @s: File descriptor for sockets * @pipe: File descriptors for pipes @@ -131,8 +121,8 @@ struct tcp_tap_conn { * @written: Bytes written (not fully written from one other side read) */ struct tcp_splice_conn { - /* Must be first element to match tcp_tap_conn */ - struct tcp_conn_common c; + /* Must be first element */ + struct flow_common f; bool in_epoll :1; int s[SIDES]; @@ -168,7 +158,7 @@ struct tcp_splice_conn { * @splice: Fields specific to spliced connections */ union tcp_conn { - struct tcp_conn_common c; + struct flow_common f; struct tcp_tap_conn tap; struct tcp_splice_conn splice; }; diff --git a/tcp_splice.c b/tcp_splice.c index a5c1332..bfd2bd1 100644 --- a/tcp_splice.c +++ b/tcp_splice.c @@ -54,6 +54,7 @@ #include "tcp_splice.h" #include "siphash.h" #include "inany.h" +#include "flow.h" #include "tcp_conn.h" @@ -476,7 +477,7 @@ bool tcp_splice_conn_from_sock(const struct ctx *c, if (setsockopt(s, SOL_TCP, TCP_QUICKACK, &((int){ 1 }), sizeof(int))) trace("TCP (spliced): failed to set TCP_QUICKACK on %i", s); - conn->c.spliced = true; + conn->f.type = FLOW_TCP_SPLICE; conn->s[0] = s; if (tcp_splice_new(c, conn, ref.port, ref.pif)) -- 2.42.0
We want to generalise "connection" tracking to things other than true TCP connections. Continue implenenting this by renaming the TCP connection table to the "flow table" and moving it to flow.c. The definitions are split between flow.h and flow_table.h - we need this separation to avoid circular dependencies: the definitions in flow.h will be needed by many headers using the flow mechanism, but flow_table.h needs all those protocol specific headers in order to define the full flow table entry. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- Makefile | 8 ++--- flow.c | 11 +++++++ flow.h | 8 +++++ flow_table.h | 25 +++++++++++++++ passt.h | 3 ++ tcp.c | 87 +++++++++++++++++++++++++--------------------------- tcp.h | 5 --- tcp_conn.h | 23 +++----------- tcp_splice.c | 19 ++++++------ 9 files changed, 107 insertions(+), 82 deletions(-) create mode 100644 flow_table.h diff --git a/Makefile b/Makefile index e2970e3..d14b029 100644 --- a/Makefile +++ b/Makefile @@ -52,10 +52,10 @@ SRCS = $(PASST_SRCS) $(QRAP_SRCS) MANPAGES = passt.1 pasta.1 qrap.1 -PASST_HEADERS = arch.h arp.h checksum.h conf.h dhcp.h dhcpv6.h flow.h icmp.h \ - inany.h isolation.h lineread.h log.h ndp.h netlink.h packet.h passt.h \ - pasta.h pcap.h pif.h port_fwd.h siphash.h tap.h tcp.h tcp_conn.h \ - tcp_splice.h udp.h util.h +PASST_HEADERS = arch.h arp.h checksum.h conf.h dhcp.h dhcpv6.h flow.h \ + flow_table.h icmp.h inany.h isolation.h lineread.h log.h ndp.h \ + netlink.h packet.h passt.h pasta.h pcap.h pif.h port_fwd.h siphash.h \ + tap.h tcp.h tcp_conn.h tcp_splice.h udp.h util.h HEADERS = $(PASST_HEADERS) seccomp.h C := \#include <linux/tcp.h>\nstruct tcp_info x = { .tcpi_snd_wnd = 0 }; diff --git a/flow.c b/flow.c index 7c3603c..a9bb8f5 100644 --- a/flow.c +++ b/flow.c @@ -6,11 +6,22 @@ */ #include <stdint.h> +#include <unistd.h> +#include <string.h> +#include "util.h" +#include "passt.h" +#include "siphash.h" +#include "inany.h" #include "flow.h" +#include "tcp_conn.h" +#include "flow_table.h" const char *flow_type_str[] = { [FLOW_TYPE_NONE] = "<none>", [FLOW_TCP] = "TCP connection", [FLOW_TCP_SPLICE] = "TCP connection (spliced)", }; + +/* Global Flow Table */ +union flow flowtab[FLOW_MAX]; diff --git a/flow.h b/flow.h index 88e6f0b..57289a8 100644 --- a/flow.h +++ b/flow.h @@ -33,4 +33,12 @@ struct flow_common { uint8_t type; }; +#define FLOW_INDEX_BITS 17 /* 128k - 1 */ +#define FLOW_MAX MAX_FROM_BITS(FLOW_INDEX_BITS) + +#define FLOW_TABLE_PRESSURE 30 /* % of FLOW_MAX */ +#define FLOW_FILE_PRESSURE 30 /* % of c->nofile */ + +union flow; + #endif /* FLOW_H */ diff --git a/flow_table.h b/flow_table.h new file mode 100644 index 0000000..c4c646b --- /dev/null +++ b/flow_table.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later + * Copyright Red Hat + * Author: David Gibson <david(a)gibson.dropbear.id.au> + * + * Definitions for the global table of packet flows. + */ +#ifndef FLOW_TABLE_H +#define FLOW_TABLE_H + +/** + * union flow - Descriptor for a logical packet flow (e.g. connection) + * @f: Fields common between all variants + * @tcp: Fields for non-spliced TCP connections + * @tcp_splice: Fields for spliced TCP connections +*/ +union flow { + struct flow_common f; + struct tcp_tap_conn tcp; + struct tcp_splice_conn tcp_splice; +}; + +/* Global Flow Table */ +extern union flow flowtab[]; + +#endif /* FLOW_TABLE_H */ diff --git a/passt.h b/passt.h index cac720a..3f5dfb9 100644 --- a/passt.h +++ b/passt.h @@ -219,6 +219,7 @@ struct ip6_ctx { * @pasta_conf_ns: Configure namespace after creating it * @no_copy_routes: Don't copy all routes when configuring target namespace * @no_copy_addrs: Don't copy all addresses when configuring namespace + * @flow_count: Number of tracked packet flows (connections etc.) * @no_tcp: Disable TCP operation * @tcp: Context for TCP protocol handler * @no_tcp: Disable UDP operation @@ -277,6 +278,8 @@ struct ctx { int no_copy_routes; int no_copy_addrs; + unsigned flow_count; + int no_tcp; struct tcp_ctx tcp; int no_udp; diff --git a/tcp.c b/tcp.c index 1c25032..0119bd3 100644 --- a/tcp.c +++ b/tcp.c @@ -302,14 +302,14 @@ #include "flow.h" #include "tcp_conn.h" +#include "flow_table.h" #define TCP_FRAMES_MEM 128 #define TCP_FRAMES \ (c->mode == MODE_PASST ? TCP_FRAMES_MEM : 1) #define TCP_HASH_TABLE_LOAD 70 /* % */ -#define TCP_HASH_TABLE_SIZE (TCP_MAX_CONNS * 100 / \ - TCP_HASH_TABLE_LOAD) +#define TCP_HASH_TABLE_SIZE (FLOW_MAX * 100 / TCP_HASH_TABLE_LOAD) #define MAX_WS 8 #define MAX_WINDOW (1 << (16 + (MAX_WS))) @@ -570,11 +570,8 @@ tcp6_l2_flags_buf[TCP_FRAMES_MEM]; static unsigned int tcp6_l2_flags_buf_used; -/* TCP connections */ -union tcp_conn tc[TCP_MAX_CONNS]; - -#define CONN(idx) (&tc[(idx)].tap) -#define CONN_IDX(conn) ((union tcp_conn *)(conn) - tc) +#define CONN(idx) (&flowtab[(idx)].tcp) +#define CONN_IDX(conn) ((union flow *)(conn) - flowtab) /** conn_at_idx() - Find a connection by index, if present * @idx: Index of connection to lookup @@ -583,7 +580,7 @@ union tcp_conn tc[TCP_MAX_CONNS]; */ static inline struct tcp_tap_conn *conn_at_idx(int idx) { - if ((idx < 0) || (idx >= TCP_MAX_CONNS)) + if ((idx < 0) || (idx >= FLOW_MAX)) return NULL; ASSERT(CONN(idx)->f.type == FLOW_TCP); return CONN(idx); @@ -1306,26 +1303,26 @@ static struct tcp_tap_conn *tcp_hash_lookup(const struct ctx *c, * @c: Execution context * @hole: Pointer to recently closed connection */ -void tcp_table_compact(struct ctx *c, union tcp_conn *hole) +void tcp_table_compact(struct ctx *c, union flow *hole) { - union tcp_conn *from; + union flow *from; - if (CONN_IDX(hole) == --c->tcp.conn_count) { + if (CONN_IDX(hole) == --c->flow_count) { debug("TCP: table compaction: maximum index was %li (%p)", CONN_IDX(hole), (void *)hole); memset(hole, 0, sizeof(*hole)); return; } - from = tc + c->tcp.conn_count; + from = flowtab + c->flow_count; memcpy(hole, from, sizeof(*hole)); switch (from->f.type) { case FLOW_TCP: - tcp_tap_conn_update(c, &from->tap, &hole->tap); + tcp_tap_conn_update(c, &from->tcp, &hole->tcp); break; case FLOW_TCP_SPLICE: - tcp_splice_conn_update(c, &hole->splice); + tcp_splice_conn_update(c, &hole->tcp_splice); break; default: die("Unexpected %s in tcp_table_compact()", @@ -1343,18 +1340,18 @@ void tcp_table_compact(struct ctx *c, union tcp_conn *hole) /** * tcp_conn_destroy() - Close sockets, trigger hash table removal and compaction * @c: Execution context - * @conn_union: Connection pointer (container union) + * @flow: Flow table entry for this connection */ -static void tcp_conn_destroy(struct ctx *c, union tcp_conn *conn_union) +static void tcp_conn_destroy(struct ctx *c, union flow *flow) { - const struct tcp_tap_conn *conn = &conn_union->tap; + const struct tcp_tap_conn *conn = &flow->tcp; close(conn->sock); if (conn->timer != -1) close(conn->timer); tcp_hash_remove(c, conn); - tcp_table_compact(c, conn_union); + tcp_table_compact(c, flow); } static void tcp_rst_do(struct ctx *c, struct tcp_tap_conn *conn); @@ -1404,24 +1401,24 @@ static void tcp_l2_data_buf_flush(const struct ctx *c) */ void tcp_defer_handler(struct ctx *c) { - union tcp_conn *conn; + union flow *flow; tcp_l2_flags_buf_flush(c); tcp_l2_data_buf_flush(c); - for (conn = tc + c->tcp.conn_count - 1; conn >= tc; conn--) { - switch (conn->f.type) { + for (flow = flowtab + c->flow_count - 1; flow >= flowtab; flow--) { + switch (flow->f.type) { case FLOW_TCP: - if (conn->tap.events == CLOSED) - tcp_conn_destroy(c, conn); + if (flow->tcp.events == CLOSED) + tcp_conn_destroy(c, flow); break; case FLOW_TCP_SPLICE: - if (conn->splice.flags & CLOSING) - tcp_splice_destroy(c, conn); + if (flow->tcp_splice.flags & CLOSING) + tcp_splice_destroy(c, flow); break; default: die("Unexpected %s in tcp_defer_handler()", - FLOW_TYPE(&conn->f)); + FLOW_TYPE(&flow->f)); } } } @@ -2004,7 +2001,7 @@ static void tcp_conn_from_tap(struct ctx *c, (void)saddr; - if (c->tcp.conn_count >= TCP_MAX_CONNS) + if (c->flow_count >= FLOW_MAX) return; if ((s = tcp_conn_pool_sock(pool)) < 0) @@ -2030,7 +2027,7 @@ static void tcp_conn_from_tap(struct ctx *c, } } - conn = CONN(c->tcp.conn_count++); + conn = CONN(c->flow_count++); conn->f.type = FLOW_TCP; conn->sock = s; conn->timer = -1; @@ -2775,24 +2772,24 @@ void tcp_listen_handler(struct ctx *c, union epoll_ref ref, { struct sockaddr_storage sa; socklen_t sl = sizeof(sa); - union tcp_conn *conn; + union flow *flow; int s; - if (c->no_tcp || c->tcp.conn_count >= TCP_MAX_CONNS) + if (c->no_tcp || c->flow_count >= FLOW_MAX) return; s = accept4(ref.fd, (struct sockaddr *)&sa, &sl, SOCK_NONBLOCK); if (s < 0) return; - conn = tc + c->tcp.conn_count++; + flow = flowtab + c->flow_count++; if (c->mode == MODE_PASTA && - tcp_splice_conn_from_sock(c, ref.tcp_listen, &conn->splice, + tcp_splice_conn_from_sock(c, ref.tcp_listen, &flow->tcp_splice, s, (struct sockaddr *)&sa)) return; - tcp_tap_conn_from_sock(c, ref.tcp_listen, &conn->tap, s, + tcp_tap_conn_from_sock(c, ref.tcp_listen, &flow->tcp, s, (struct sockaddr *)&sa, now); } @@ -2921,18 +2918,18 @@ static void tcp_tap_sock_handler(struct ctx *c, struct tcp_tap_conn *conn, */ void tcp_sock_handler(struct ctx *c, union epoll_ref ref, uint32_t events) { - union tcp_conn *conn = tc + ref.tcp.index; + union flow *flow = flowtab + ref.tcp.index; - switch (conn->f.type) { + switch (flow->f.type) { case FLOW_TCP: - tcp_tap_sock_handler(c, &conn->tap, events); + tcp_tap_sock_handler(c, &flow->tcp, events); break; case FLOW_TCP_SPLICE: - tcp_splice_sock_handler(c, &conn->splice, ref.fd, events); + tcp_splice_sock_handler(c, &flow->tcp_splice, ref.fd, events); break; default: die("Unexpected %s in tcp_sock_handler_compact()", - FLOW_TYPE(&conn->f)); + FLOW_TYPE(&flow->f)); } } @@ -3248,7 +3245,7 @@ static int tcp_port_rebind_outbound(void *arg) */ void tcp_timer(struct ctx *c, const struct timespec *ts) { - union tcp_conn *conn; + union flow *flow; (void)ts; @@ -3264,18 +3261,18 @@ void tcp_timer(struct ctx *c, const struct timespec *ts) } } - for (conn = tc + c->tcp.conn_count - 1; conn >= tc; conn--) { - switch (conn->f.type) { + for (flow = flowtab + c->flow_count - 1; flow >= flowtab; flow--) { + switch (flow->f.type) { case FLOW_TCP: - if (conn->tap.events == CLOSED) - tcp_conn_destroy(c, conn); + if (flow->tcp.events == CLOSED) + tcp_conn_destroy(c, flow); break; case FLOW_TCP_SPLICE: - tcp_splice_timer(c, conn); + tcp_splice_timer(c, flow); break; default: die("Unexpected %s in tcp_timer()", - FLOW_TYPE(&conn->f)); + FLOW_TYPE(&flow->f)); } } diff --git a/tcp.h b/tcp.h index 06965d8..c8b738d 100644 --- a/tcp.h +++ b/tcp.h @@ -8,9 +8,6 @@ #define TCP_TIMER_INTERVAL 1000 /* ms */ -#define TCP_CONN_INDEX_BITS 17 /* 128k - 1 */ -#define TCP_MAX_CONNS MAX_FROM_BITS(TCP_CONN_INDEX_BITS) - struct ctx; void tcp_timer_handler(struct ctx *c, union epoll_ref ref); @@ -56,7 +53,6 @@ union tcp_listen_epoll_ref { /** * struct tcp_ctx - Execution context for TCP routines * @hash_secret: 128-bit secret for hash functions, ISN and hash table - * @conn_count: Count of total connections in connection table * @port_to_tap: Ports bound host-side, packets to tap or spliced * @fwd_in: Port forwarding configuration for inbound packets * @fwd_out: Port forwarding configuration for outbound packets @@ -66,7 +62,6 @@ union tcp_listen_epoll_ref { */ struct tcp_ctx { uint64_t hash_secret[2]; - int conn_count; struct port_fwd fwd_in; struct port_fwd fwd_out; struct timespec timer_run; diff --git a/tcp_conn.h b/tcp_conn.h index 136f8da..5a107fc 100644 --- a/tcp_conn.h +++ b/tcp_conn.h @@ -40,7 +40,7 @@ struct tcp_tap_conn { struct flow_common f; bool in_epoll :1; - int next_index :TCP_CONN_INDEX_BITS + 2; + int next_index :FLOW_INDEX_BITS + 2; #define TCP_RETRANS_BITS 3 unsigned int retrans :TCP_RETRANS_BITS; @@ -151,21 +151,6 @@ struct tcp_splice_conn { uint32_t written[SIDES]; }; -/** - * union tcp_conn - Descriptor for a TCP connection (spliced or non-spliced) - * @c: Fields common between all variants - * @tap: Fields specific to non-spliced connections - * @splice: Fields specific to spliced connections -*/ -union tcp_conn { - struct flow_common f; - struct tcp_tap_conn tap; - struct tcp_splice_conn splice; -}; - -/* TCP connections */ -extern union tcp_conn tc[]; - /* Socket pools */ #define TCP_SOCK_POOL_SIZE 32 @@ -173,9 +158,9 @@ extern int init_sock_pool4 [TCP_SOCK_POOL_SIZE]; extern int init_sock_pool6 [TCP_SOCK_POOL_SIZE]; void tcp_splice_conn_update(const struct ctx *c, struct tcp_splice_conn *new); -void tcp_table_compact(struct ctx *c, union tcp_conn *hole); -void tcp_splice_destroy(struct ctx *c, union tcp_conn *conn_union); -void tcp_splice_timer(struct ctx *c, union tcp_conn *conn_union); +void tcp_table_compact(struct ctx *c, union flow *hole); +void tcp_splice_destroy(struct ctx *c, union flow *flow); +void tcp_splice_timer(struct ctx *c, union flow *flow); int tcp_conn_pool_sock(int pool[]); int tcp_conn_new_sock(const struct ctx *c, sa_family_t af); void tcp_sock_refill_pool(const struct ctx *c, int pool[], int af); diff --git a/tcp_splice.c b/tcp_splice.c index bfd2bd1..9f4831a 100644 --- a/tcp_splice.c +++ b/tcp_splice.c @@ -57,6 +57,7 @@ #include "flow.h" #include "tcp_conn.h" +#include "flow_table.h" #define MAX_PIPE_SIZE (8UL * 1024 * 1024) #define TCP_SPLICE_PIPE_POOL_SIZE 32 @@ -76,7 +77,7 @@ static int splice_pipe_pool [TCP_SPLICE_PIPE_POOL_SIZE][2]; #define CONN_V4(x) (!CONN_V6(x)) #define CONN_HAS(conn, set) ((conn->events & (set)) == (set)) #define CONN(idx) (&tc[(idx)].splice) -#define CONN_IDX(conn) ((union tcp_conn *)(conn) - tc) +#define CONN_IDX(conn) ((union flow *)(conn) - flowtab) /* Display strings for connection events */ static const char *tcp_splice_event_str[] __attribute((__unused__)) = { @@ -254,11 +255,11 @@ void tcp_splice_conn_update(const struct ctx *c, struct tcp_splice_conn *new) /** * tcp_splice_destroy() - Close spliced connection and pipes, clear * @c: Execution context - * @conn_union: Spliced connection (container union) + * @flow: Flow table entry */ -void tcp_splice_destroy(struct ctx *c, union tcp_conn *conn_union) +void tcp_splice_destroy(struct ctx *c, union flow *flow) { - struct tcp_splice_conn *conn = &conn_union->splice; + struct tcp_splice_conn *conn = &flow->tcp_splice; int side; for (side = 0; side < SIDES; side++) { @@ -283,7 +284,7 @@ void tcp_splice_destroy(struct ctx *c, union tcp_conn *conn_union) conn->flags = 0; debug("TCP (spliced): index %li, CLOSED", CONN_IDX(conn)); - tcp_table_compact(c, conn_union); + tcp_table_compact(c, flow); } /** @@ -775,15 +776,15 @@ void tcp_splice_init(struct ctx *c) /** * tcp_splice_timer() - Timer for spliced connections * @c: Execution context - * @conn_union: Spliced connection (container union) + * @flow: Flow table entry */ -void tcp_splice_timer(struct ctx *c, union tcp_conn *conn_union) +void tcp_splice_timer(struct ctx *c, union flow *flow) { - struct tcp_splice_conn *conn = &conn_union->splice; + struct tcp_splice_conn *conn = &flow->tcp_splice; int side; if (conn->flags & CLOSING) { - tcp_splice_destroy(c, conn_union); + tcp_splice_destroy(c, flow); return; } -- 2.42.0
Both tcp.c and tcp_splice.c define CONN_IDX() variants to find the index of their connection structures in the connection table, now become the unified flow table. We can easily combine these into a common helper. While we're there, add some trickery for some additional type safety. They also define their own CONN() versions, which aren't so easily combined since they need to return different types, but we can have them use a common helper. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- flow_table.h | 20 +++++++++++++++++ tcp.c | 61 ++++++++++++++++++++++++++-------------------------- tcp_conn.h | 2 +- tcp_splice.c | 21 +++++++++--------- 4 files changed, 61 insertions(+), 43 deletions(-) diff --git a/flow_table.h b/flow_table.h index c4c646b..ed4f578 100644 --- a/flow_table.h +++ b/flow_table.h @@ -22,4 +22,24 @@ union flow { /* Global Flow Table */ extern union flow flowtab[]; + +/** flow_idx - Index of flow from common structure + * @f: Common flow fields pointer + * + * Return: index of @f in the flow table + */ +static inline unsigned flow_idx(const struct flow_common *f) +{ + return (union flow *)f - flowtab; +} + +/** FLOW_IDX - Find the index of a flow + * @f_: Flow pointer, either union flow * or protocol specific + * + * Return: index of @f in the flow table + */ +#define FLOW_IDX(f_) (flow_idx(&(f_)->f)) + +#define FLOW(idx) (&flowtab[(idx)]) + #endif /* FLOW_TABLE_H */ diff --git a/tcp.c b/tcp.c index 0119bd3..b2ea95e 100644 --- a/tcp.c +++ b/tcp.c @@ -570,8 +570,7 @@ tcp6_l2_flags_buf[TCP_FRAMES_MEM]; static unsigned int tcp6_l2_flags_buf_used; -#define CONN(idx) (&flowtab[(idx)].tcp) -#define CONN_IDX(conn) ((union flow *)(conn) - flowtab) +#define CONN(idx) (&(FLOW(idx)->tcp)) /** conn_at_idx() - Find a connection by index, if present * @idx: Index of connection to lookup @@ -640,7 +639,7 @@ static int tcp_epoll_ctl(const struct ctx *c, struct tcp_tap_conn *conn) { int m = conn->in_epoll ? EPOLL_CTL_MOD : EPOLL_CTL_ADD; union epoll_ref ref = { .type = EPOLL_TYPE_TCP, .fd = conn->sock, - .tcp.index = CONN_IDX(conn) }; + .tcp.index = FLOW_IDX(conn) }; struct epoll_event ev = { .data.u64 = ref.u64 }; if (conn->events == CLOSED) { @@ -661,7 +660,7 @@ static int tcp_epoll_ctl(const struct ctx *c, struct tcp_tap_conn *conn) if (conn->timer != -1) { union epoll_ref ref_t = { .type = EPOLL_TYPE_TCP_TIMER, .fd = conn->sock, - .tcp.index = CONN_IDX(conn) }; + .tcp.index = FLOW_IDX(conn) }; struct epoll_event ev_t = { .data.u64 = ref_t.u64, .events = EPOLLIN | EPOLLET }; @@ -689,7 +688,7 @@ static void tcp_timer_ctl(const struct ctx *c, struct tcp_tap_conn *conn) if (conn->timer == -1) { union epoll_ref ref = { .type = EPOLL_TYPE_TCP_TIMER, .fd = conn->sock, - .tcp.index = CONN_IDX(conn) }; + .tcp.index = FLOW_IDX(conn) }; struct epoll_event ev = { .data.u64 = ref.u64, .events = EPOLLIN | EPOLLET }; int fd; @@ -725,7 +724,7 @@ static void tcp_timer_ctl(const struct ctx *c, struct tcp_tap_conn *conn) it.it_value.tv_sec = ACT_TIMEOUT; } - debug("TCP: index %li, timer expires in %lu.%03lus", CONN_IDX(conn), + debug("TCP: index %u, timer expires in %lu.%03lus", FLOW_IDX(conn), it.it_value.tv_sec, it.it_value.tv_nsec / 1000 / 1000); timerfd_settime(conn->timer, 0, &it, NULL); @@ -748,7 +747,7 @@ static void conn_flag_do(const struct ctx *c, struct tcp_tap_conn *conn, conn->flags &= flag; if (flag_index >= 0) { - debug("TCP: index %li: %s dropped", CONN_IDX(conn), + debug("TCP: index %u: %s dropped", FLOW_IDX(conn), tcp_flag_str[flag_index]); } } else { @@ -769,7 +768,7 @@ static void conn_flag_do(const struct ctx *c, struct tcp_tap_conn *conn, conn->flags |= flag; if (flag_index >= 0) { - debug("TCP: index %li: %s", CONN_IDX(conn), + debug("TCP: index %u: %s", FLOW_IDX(conn), tcp_flag_str[flag_index]); } } @@ -819,12 +818,12 @@ static void conn_event_do(const struct ctx *c, struct tcp_tap_conn *conn, new += 5; if (prev != new) { - debug("TCP: index %li, %s: %s -> %s", CONN_IDX(conn), + debug("TCP: index %u, %s: %s -> %s", FLOW_IDX(conn), num == -1 ? "CLOSED" : tcp_event_str[num], prev == -1 ? "CLOSED" : tcp_state_str[prev], (new == -1 || num == -1) ? "CLOSED" : tcp_state_str[new]); } else { - debug("TCP: index %li, %s", CONN_IDX(conn), + debug("TCP: index %u, %s", FLOW_IDX(conn), num == -1 ? "CLOSED" : tcp_event_str[num]); } @@ -1204,11 +1203,11 @@ static void tcp_hash_insert(const struct ctx *c, struct tcp_tap_conn *conn) int b; b = tcp_hash(c, &conn->faddr, conn->eport, conn->fport); - conn->next_index = tc_hash[b] ? CONN_IDX(tc_hash[b]) : -1; + conn->next_index = tc_hash[b] ? FLOW_IDX(tc_hash[b]) : -1U; tc_hash[b] = conn; - debug("TCP: hash table insert: index %li, sock %i, bucket: %i, next: " - "%p", CONN_IDX(conn), conn->sock, b, + debug("TCP: hash table insert: index %u, sock %i, bucket: %i, next: " + "%p", FLOW_IDX(conn), conn->sock, b, (void *)conn_at_idx(conn->next_index)); } @@ -1234,8 +1233,8 @@ static void tcp_hash_remove(const struct ctx *c, } } - debug("TCP: hash table remove: index %li, sock %i, bucket: %i, new: %p", - CONN_IDX(conn), conn->sock, b, + debug("TCP: hash table remove: index %u, sock %i, bucket: %i, new: %p", + FLOW_IDX(conn), conn->sock, b, (void *)(prev ? conn_at_idx(prev->next_index) : tc_hash[b])); } @@ -1255,16 +1254,16 @@ static void tcp_tap_conn_update(const struct ctx *c, struct tcp_tap_conn *old, prev = entry, entry = conn_at_idx(entry->next_index)) { if (entry == old) { if (prev) - prev->next_index = CONN_IDX(new); + prev->next_index = FLOW_IDX(new); else tc_hash[b] = new; break; } } - debug("TCP: hash table update: old index %li, new index %li, sock %i, " + debug("TCP: hash table update: old index %u, new index %u, sock %i, " "bucket: %i, old: %p, new: %p", - CONN_IDX(old), CONN_IDX(new), new->sock, b, + FLOW_IDX(old), FLOW_IDX(new), new->sock, b, (void *)old, (void *)new); tcp_epoll_ctl(c, new); @@ -1307,9 +1306,9 @@ void tcp_table_compact(struct ctx *c, union flow *hole) { union flow *from; - if (CONN_IDX(hole) == --c->flow_count) { - debug("TCP: table compaction: maximum index was %li (%p)", - CONN_IDX(hole), (void *)hole); + if (FLOW_IDX(hole) == --c->flow_count) { + debug("TCP: table compaction: maximum index was %u (%p)", + FLOW_IDX(hole), (void *)hole); memset(hole, 0, sizeof(*hole)); return; } @@ -1329,9 +1328,9 @@ void tcp_table_compact(struct ctx *c, union flow *hole) FLOW_TYPE(&from->f)); } - debug("TCP: table compaction (%s): old index %li, new index %li, " + debug("TCP: table compaction (%s): old index %u, new index %u, " "from: %p, to: %p", - FLOW_TYPE(&from->f), CONN_IDX(from), CONN_IDX(hole), + FLOW_TYPE(&from->f), FLOW_IDX(from), FLOW_IDX(hole), (void *)from, (void *)hole); memset(from, 0, sizeof(*from)); @@ -1357,7 +1356,7 @@ static void tcp_conn_destroy(struct ctx *c, union flow *flow) static void tcp_rst_do(struct ctx *c, struct tcp_tap_conn *conn); #define tcp_rst(c, conn) \ do { \ - debug("TCP: index %li, reset at %s:%i", CONN_IDX(conn), \ + debug("TCP: index %u, reset at %s:%i", FLOW_IDX(conn), \ __func__, __LINE__); \ tcp_rst_do(c, conn); \ } while (0) @@ -2581,7 +2580,7 @@ int tcp_tap_handler(struct ctx *c, uint8_t pif, int af, return 1; } - trace("TCP: packet length %lu from tap for index %lu", len, CONN_IDX(conn)); + trace("TCP: packet length %lu from tap for index %u", len, FLOW_IDX(conn)); if (th->rst) { conn_event(c, conn, CLOSED); @@ -2821,17 +2820,17 @@ void tcp_timer_handler(struct ctx *c, union epoll_ref ref) tcp_timer_ctl(c, conn); } else if (conn->flags & ACK_FROM_TAP_DUE) { if (!(conn->events & ESTABLISHED)) { - debug("TCP: index %li, handshake timeout", CONN_IDX(conn)); + debug("TCP: index %u, handshake timeout", FLOW_IDX(conn)); tcp_rst(c, conn); } else if (CONN_HAS(conn, SOCK_FIN_SENT | TAP_FIN_ACKED)) { - debug("TCP: index %li, FIN timeout", CONN_IDX(conn)); + debug("TCP: index %u, FIN timeout", FLOW_IDX(conn)); tcp_rst(c, conn); } else if (conn->retrans == TCP_MAX_RETRANS) { - debug("TCP: index %li, retransmissions count exceeded", - CONN_IDX(conn)); + debug("TCP: index %u, retransmissions count exceeded", + FLOW_IDX(conn)); tcp_rst(c, conn); } else { - debug("TCP: index %li, ACK timeout, retry", CONN_IDX(conn)); + debug("TCP: index %u, ACK timeout, retry", FLOW_IDX(conn)); conn->retrans++; conn->seq_to_tap = conn->seq_ack_from_tap; tcp_data_from_sock(c, conn); @@ -2849,7 +2848,7 @@ void tcp_timer_handler(struct ctx *c, union epoll_ref ref) */ timerfd_settime(conn->timer, 0, &new, &old); if (old.it_value.tv_sec == ACT_TIMEOUT) { - debug("TCP: index %li, activity timeout", CONN_IDX(conn)); + debug("TCP: index %u, activity timeout", FLOW_IDX(conn)); tcp_rst(c, conn); } } diff --git a/tcp_conn.h b/tcp_conn.h index 5a107fc..5a9376e 100644 --- a/tcp_conn.h +++ b/tcp_conn.h @@ -40,7 +40,7 @@ struct tcp_tap_conn { struct flow_common f; bool in_epoll :1; - int next_index :FLOW_INDEX_BITS + 2; + unsigned next_index :FLOW_INDEX_BITS + 2; #define TCP_RETRANS_BITS 3 unsigned int retrans :TCP_RETRANS_BITS; diff --git a/tcp_splice.c b/tcp_splice.c index 9f4831a..3955417 100644 --- a/tcp_splice.c +++ b/tcp_splice.c @@ -76,8 +76,7 @@ static int splice_pipe_pool [TCP_SPLICE_PIPE_POOL_SIZE][2]; #define CONN_V6(x) (x->flags & SPLICE_V6) #define CONN_V4(x) (!CONN_V6(x)) #define CONN_HAS(conn, set) ((conn->events & (set)) == (set)) -#define CONN(idx) (&tc[(idx)].splice) -#define CONN_IDX(conn) ((union flow *)(conn) - flowtab) +#define CONN(idx) (&FLOW(idx)->tcp_splice) /* Display strings for connection events */ static const char *tcp_splice_event_str[] __attribute((__unused__)) = { @@ -129,8 +128,8 @@ static int tcp_splice_epoll_ctl(const struct ctx *c, { int m = conn->in_epoll ? EPOLL_CTL_MOD : EPOLL_CTL_ADD; union epoll_ref ref[SIDES] = { - { .type = EPOLL_TYPE_TCP, .fd = conn->s[0], .tcp.index = CONN_IDX(conn) }, - { .type = EPOLL_TYPE_TCP, .fd = conn->s[1], .tcp.index = CONN_IDX(conn) } + { .type = EPOLL_TYPE_TCP, .fd = conn->s[0], .tcp.index = FLOW_IDX(conn) }, + { .type = EPOLL_TYPE_TCP, .fd = conn->s[1], .tcp.index = FLOW_IDX(conn) } }; struct epoll_event ev[SIDES] = { { .data.u64 = ref[0].u64 }, { .data.u64 = ref[1].u64 } }; @@ -140,8 +139,8 @@ static int tcp_splice_epoll_ctl(const struct ctx *c, if (epoll_ctl(c->epollfd, m, conn->s[0], &ev[0]) || epoll_ctl(c->epollfd, m, conn->s[1], &ev[1])) { int ret = -errno; - err("TCP (spliced): index %li, ERROR on epoll_ctl(): %s", - CONN_IDX(conn), strerror(errno)); + err("TCP (spliced): index %u, ERROR on epoll_ctl(): %s", + FLOW_IDX(conn), strerror(errno)); return ret; } @@ -167,7 +166,7 @@ static void conn_flag_do(const struct ctx *c, struct tcp_splice_conn *conn, conn->flags &= flag; if (flag_index >= 0) { - debug("TCP (spliced): index %li: %s dropped", CONN_IDX(conn), + debug("TCP (spliced): index %u: %s dropped", FLOW_IDX(conn), tcp_splice_flag_str[flag_index]); } } else { @@ -178,7 +177,7 @@ static void conn_flag_do(const struct ctx *c, struct tcp_splice_conn *conn, conn->flags |= flag; if (flag_index >= 0) { - debug("TCP (spliced): index %li: %s", CONN_IDX(conn), + debug("TCP (spliced): index %u: %s", FLOW_IDX(conn), tcp_splice_flag_str[flag_index]); } } @@ -213,7 +212,7 @@ static void conn_event_do(const struct ctx *c, struct tcp_splice_conn *conn, conn->events &= event; if (flag_index >= 0) { - debug("TCP (spliced): index %li, ~%s", CONN_IDX(conn), + debug("TCP (spliced): index %u, ~%s", FLOW_IDX(conn), tcp_splice_event_str[flag_index]); } } else { @@ -224,7 +223,7 @@ static void conn_event_do(const struct ctx *c, struct tcp_splice_conn *conn, conn->events |= event; if (flag_index >= 0) { - debug("TCP (spliced): index %li, %s", CONN_IDX(conn), + debug("TCP (spliced): index %u, %s", FLOW_IDX(conn), tcp_splice_event_str[flag_index]); } } @@ -282,7 +281,7 @@ void tcp_splice_destroy(struct ctx *c, union flow *flow) conn->events = SPLICE_CLOSED; conn->flags = 0; - debug("TCP (spliced): index %li, CLOSED", CONN_IDX(conn)); + debug("TCP (spliced): index %u, CLOSED", FLOW_IDX(conn)); tcp_table_compact(c, flow); } -- 2.42.0
tcp_table_compact() will move entries in the connection/flow table to keep it compact when other entries are removed. The moved entries need not have the same type as the flow removed, so it needs to be able to handle moving any type of flow. Therefore, move it to flow.c rather than being purportedly TCP specific. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- flow.c | 39 +++++++++++++++++++++++++++++++++++++++ flow.h | 2 ++ tcp.c | 46 ++++------------------------------------------ tcp_conn.h | 3 ++- tcp_splice.c | 2 +- 5 files changed, 48 insertions(+), 44 deletions(-) diff --git a/flow.c b/flow.c index a9bb8f5..0fff119 100644 --- a/flow.c +++ b/flow.c @@ -25,3 +25,42 @@ const char *flow_type_str[] = { /* Global Flow Table */ union flow flowtab[FLOW_MAX]; + +/** + * flow_table_compact() - Perform compaction on flow table + * @c: Execution context + * @hole: Pointer to recently closed flow + */ +void flow_table_compact(struct ctx *c, union flow *hole) +{ + union flow *from; + + if (FLOW_IDX(hole) == --c->flow_count) { + debug("flow: table compaction: maximum index was %u (%p)", + FLOW_IDX(hole), (void *)hole); + memset(hole, 0, sizeof(*hole)); + return; + } + + from = flowtab + c->flow_count; + memcpy(hole, from, sizeof(*hole)); + + switch (from->f.type) { + case FLOW_TCP: + tcp_tap_conn_update(c, &from->tcp, &hole->tcp); + break; + case FLOW_TCP_SPLICE: + tcp_splice_conn_update(c, &hole->tcp_splice); + break; + default: + die("Unexpected %s in tcp_table_compact()", + FLOW_TYPE(&from->f)); + } + + debug("flow: table compaction (%s): old index %u, new index %u, " + "from: %p, to: %p", + FLOW_TYPE(&from->f), FLOW_IDX(from), FLOW_IDX(hole), + (void *)from, (void *)hole); + + memset(from, 0, sizeof(*from)); +} diff --git a/flow.h b/flow.h index 57289a8..9f49195 100644 --- a/flow.h +++ b/flow.h @@ -41,4 +41,6 @@ struct flow_common { union flow; +void flow_table_compact(struct ctx *c, union flow *hole); + #endif /* FLOW_H */ diff --git a/tcp.c b/tcp.c index b2ea95e..ba11887 100644 --- a/tcp.c +++ b/tcp.c @@ -1244,8 +1244,9 @@ static void tcp_hash_remove(const struct ctx *c, * @old: Old location of tcp_tap_conn * @new: New location of tcp_tap_conn */ -static void tcp_tap_conn_update(const struct ctx *c, struct tcp_tap_conn *old, - struct tcp_tap_conn *new) +void tcp_tap_conn_update(const struct ctx *c, struct tcp_tap_conn *old, + struct tcp_tap_conn *new) + { struct tcp_tap_conn *entry, *prev = NULL; int b = tcp_conn_hash(c, old); @@ -1297,45 +1298,6 @@ static struct tcp_tap_conn *tcp_hash_lookup(const struct ctx *c, return NULL; } -/** - * tcp_table_compact() - Perform compaction on connection table - * @c: Execution context - * @hole: Pointer to recently closed connection - */ -void tcp_table_compact(struct ctx *c, union flow *hole) -{ - union flow *from; - - if (FLOW_IDX(hole) == --c->flow_count) { - debug("TCP: table compaction: maximum index was %u (%p)", - FLOW_IDX(hole), (void *)hole); - memset(hole, 0, sizeof(*hole)); - return; - } - - from = flowtab + c->flow_count; - memcpy(hole, from, sizeof(*hole)); - - switch (from->f.type) { - case FLOW_TCP: - tcp_tap_conn_update(c, &from->tcp, &hole->tcp); - break; - case FLOW_TCP_SPLICE: - tcp_splice_conn_update(c, &hole->tcp_splice); - break; - default: - die("Unexpected %s in tcp_table_compact()", - FLOW_TYPE(&from->f)); - } - - debug("TCP: table compaction (%s): old index %u, new index %u, " - "from: %p, to: %p", - FLOW_TYPE(&from->f), FLOW_IDX(from), FLOW_IDX(hole), - (void *)from, (void *)hole); - - memset(from, 0, sizeof(*from)); -} - /** * tcp_conn_destroy() - Close sockets, trigger hash table removal and compaction * @c: Execution context @@ -1350,7 +1312,7 @@ static void tcp_conn_destroy(struct ctx *c, union flow *flow) close(conn->timer); tcp_hash_remove(c, conn); - tcp_table_compact(c, flow); + flow_table_compact(c, flow); } static void tcp_rst_do(struct ctx *c, struct tcp_tap_conn *conn); diff --git a/tcp_conn.h b/tcp_conn.h index 5a9376e..3900305 100644 --- a/tcp_conn.h +++ b/tcp_conn.h @@ -157,8 +157,9 @@ struct tcp_splice_conn { extern int init_sock_pool4 [TCP_SOCK_POOL_SIZE]; extern int init_sock_pool6 [TCP_SOCK_POOL_SIZE]; +void tcp_tap_conn_update(const struct ctx *c, struct tcp_tap_conn *old, + struct tcp_tap_conn *new); void tcp_splice_conn_update(const struct ctx *c, struct tcp_splice_conn *new); -void tcp_table_compact(struct ctx *c, union flow *hole); void tcp_splice_destroy(struct ctx *c, union flow *flow); void tcp_splice_timer(struct ctx *c, union flow *flow); int tcp_conn_pool_sock(int pool[]); diff --git a/tcp_splice.c b/tcp_splice.c index 3955417..212fe16 100644 --- a/tcp_splice.c +++ b/tcp_splice.c @@ -283,7 +283,7 @@ void tcp_splice_destroy(struct ctx *c, union flow *flow) conn->flags = 0; debug("TCP (spliced): index %u, CLOSED", FLOW_IDX(conn)); - tcp_table_compact(c, flow); + flow_table_compact(c, flow); } /** -- 2.42.0
Most of the messages logged by the TCP code (be they errors, debug or trace messages) are related to a specific connection / flow. We're fairly consistent about prefixing these with the type of connection and the connection / flow index. However there are a few places where we put the index later in the message or omit it entirely. The template with the prefix is also a little bulky to carry around for every message, particularly for spliced connections. To help keep this consistent, introduce some helpers to log messages linked to a specific flow. It takes the flow as a parameter and adds a uniform prefix to each message. This makes things slightly neater now, but more importantly will help keep formatting consistent as we add more things to the flow table. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- flow.c | 21 ++++++++++++++ flow.h | 14 +++++++++ tcp.c | 81 ++++++++++++++++++++++++---------------------------- tcp_splice.c | 61 +++++++++++++++++---------------------- 4 files changed, 99 insertions(+), 78 deletions(-) diff --git a/flow.c b/flow.c index 0fff119..cb2cf62 100644 --- a/flow.c +++ b/flow.c @@ -64,3 +64,24 @@ void flow_table_compact(struct ctx *c, union flow *hole) memset(from, 0, sizeof(*from)); } + +/** flow_log_ - Log flow-related message + * @f: flow the message is related to + * @pri: Log priority + * @fmt: Format string + * @...: printf-arguments + * + * @fmt must include an initial "%s" to expand to the prefix we generate + * (typically added by the flow_log() macro). + */ +void flow_log_(const struct flow_common *f, int pri, const char *fmt, ...) +{ + char msg[BUFSIZ]; + va_list args; + + va_start(args, fmt); + (void)vsnprintf(msg, sizeof(msg), fmt, args); + va_end(args); + + logmsg(pri, "Flow %u (%s): %s", flow_idx(f), FLOW_TYPE(f), msg); +} diff --git a/flow.h b/flow.h index 9f49195..b6da516 100644 --- a/flow.h +++ b/flow.h @@ -43,4 +43,18 @@ union flow; void flow_table_compact(struct ctx *c, union flow *hole); +void flow_log_(const struct flow_common *f, int pri, const char *fmt, ...) + __attribute__((format(printf, 3, 4))); + +#define flow_log(f_, pri, ...) flow_log_(&(f_)->f, (pri), __VA_ARGS__) + +#define flow_dbg(f, ...) flow_log((f), LOG_DEBUG, __VA_ARGS__) +#define flow_err(f, ...) flow_log((f), LOG_ERR, __VA_ARGS__) + +#define flow_trace(f, ...) \ + do { \ + if (log_trace) \ + flow_dbg((f), __VA_ARGS__); \ + } while (0) + #endif /* FLOW_H */ diff --git a/tcp.c b/tcp.c index ba11887..994f70b 100644 --- a/tcp.c +++ b/tcp.c @@ -624,7 +624,7 @@ static void conn_flag_do(const struct ctx *c, struct tcp_tap_conn *conn, unsigned long flag); #define conn_flag(c, conn, flag) \ do { \ - trace("TCP: flag at %s:%i", __func__, __LINE__); \ + flow_trace(conn, "flag at %s:%i", __func__, __LINE__); \ conn_flag_do(c, conn, flag); \ } while (0) @@ -695,7 +695,8 @@ static void tcp_timer_ctl(const struct ctx *c, struct tcp_tap_conn *conn) fd = timerfd_create(CLOCK_MONOTONIC, 0); if (fd == -1 || fd > FD_REF_MAX) { - debug("TCP: failed to get timer: %s", strerror(errno)); + flow_dbg(conn, "failed to get timer: %s", + strerror(errno)); if (fd > -1) close(fd); conn->timer = -1; @@ -704,7 +705,8 @@ static void tcp_timer_ctl(const struct ctx *c, struct tcp_tap_conn *conn) conn->timer = fd; if (epoll_ctl(c->epollfd, EPOLL_CTL_ADD, conn->timer, &ev)) { - debug("TCP: failed to add timer: %s", strerror(errno)); + flow_dbg(conn, "failed to add timer: %s", + strerror(errno)); close(conn->timer); conn->timer = -1; return; @@ -724,8 +726,8 @@ static void tcp_timer_ctl(const struct ctx *c, struct tcp_tap_conn *conn) it.it_value.tv_sec = ACT_TIMEOUT; } - debug("TCP: index %u, timer expires in %lu.%03lus", FLOW_IDX(conn), - it.it_value.tv_sec, it.it_value.tv_nsec / 1000 / 1000); + flow_dbg(conn, "timer expires in %lu.%03lus", it.it_value.tv_sec, + it.it_value.tv_nsec / 1000 / 1000); timerfd_settime(conn->timer, 0, &it, NULL); } @@ -746,10 +748,8 @@ static void conn_flag_do(const struct ctx *c, struct tcp_tap_conn *conn, return; conn->flags &= flag; - if (flag_index >= 0) { - debug("TCP: index %u: %s dropped", FLOW_IDX(conn), - tcp_flag_str[flag_index]); - } + if (flag_index >= 0) + flow_dbg(conn, "%s dropped", tcp_flag_str[flag_index]); } else { int flag_index = fls(flag); @@ -767,10 +767,8 @@ static void conn_flag_do(const struct ctx *c, struct tcp_tap_conn *conn, } conn->flags |= flag; - if (flag_index >= 0) { - debug("TCP: index %u: %s", FLOW_IDX(conn), - tcp_flag_str[flag_index]); - } + if (flag_index >= 0) + flow_dbg(conn, "%s", tcp_flag_str[flag_index]); } if (flag == STALLED || flag == ~STALLED) @@ -817,15 +815,14 @@ static void conn_event_do(const struct ctx *c, struct tcp_tap_conn *conn, if (conn->flags & ACTIVE_CLOSE) new += 5; - if (prev != new) { - debug("TCP: index %u, %s: %s -> %s", FLOW_IDX(conn), - num == -1 ? "CLOSED" : tcp_event_str[num], - prev == -1 ? "CLOSED" : tcp_state_str[prev], - (new == -1 || num == -1) ? "CLOSED" : tcp_state_str[new]); - } else { - debug("TCP: index %u, %s", FLOW_IDX(conn), - num == -1 ? "CLOSED" : tcp_event_str[num]); - } + if (prev != new) + flow_dbg(conn, "%s: %s -> %s", + num == -1 ? "CLOSED" : tcp_event_str[num], + prev == -1 ? "CLOSED" : tcp_state_str[prev], + (new == -1 || num == -1) ? "CLOSED" : tcp_state_str[new]); + else + flow_dbg(conn, "%s", + num == -1 ? "CLOSED" : tcp_event_str[num]); if ((event == TAP_FIN_RCVD) && !(conn->events & SOCK_FIN_RCVD)) conn_flag(c, conn, ACTIVE_CLOSE); @@ -838,7 +835,7 @@ static void conn_event_do(const struct ctx *c, struct tcp_tap_conn *conn, #define conn_event(c, conn, event) \ do { \ - trace("TCP: event at %s:%i", __func__, __LINE__); \ + flow_trace(conn, "event at %s:%i", __func__, __LINE__); \ conn_event_do(c, conn, event); \ } while (0) @@ -1206,9 +1203,8 @@ static void tcp_hash_insert(const struct ctx *c, struct tcp_tap_conn *conn) conn->next_index = tc_hash[b] ? FLOW_IDX(tc_hash[b]) : -1U; tc_hash[b] = conn; - debug("TCP: hash table insert: index %u, sock %i, bucket: %i, next: " - "%p", FLOW_IDX(conn), conn->sock, b, - (void *)conn_at_idx(conn->next_index)); + flow_dbg(conn, "hash table insert: sock %i, bucket: %i, next: %p", + conn->sock, b, (void *)conn_at_idx(conn->next_index)); } /** @@ -1233,9 +1229,9 @@ static void tcp_hash_remove(const struct ctx *c, } } - debug("TCP: hash table remove: index %u, sock %i, bucket: %i, new: %p", - FLOW_IDX(conn), conn->sock, b, - (void *)(prev ? conn_at_idx(prev->next_index) : tc_hash[b])); + flow_dbg(conn, "hash table remove: sock %i, bucket: %i, new: %p", + conn->sock, b, + (void *)(prev ? conn_at_idx(prev->next_index) : tc_hash[b])); } /** @@ -1318,8 +1314,7 @@ static void tcp_conn_destroy(struct ctx *c, union flow *flow) static void tcp_rst_do(struct ctx *c, struct tcp_tap_conn *conn); #define tcp_rst(c, conn) \ do { \ - debug("TCP: index %u, reset at %s:%i", FLOW_IDX(conn), \ - __func__, __LINE__); \ + flow_dbg((conn), "TCP reset at %s:%i", __func__, __LINE__); \ tcp_rst_do(c, conn); \ } while (0) @@ -1998,7 +1993,7 @@ static void tcp_conn_from_tap(struct ctx *c, mss = tcp_conn_tap_mss(conn, opts, optlen); if (setsockopt(s, SOL_TCP, TCP_MAXSEG, &mss, sizeof(mss))) - trace("TCP: failed to set TCP_MAXSEG on socket %i", s); + flow_trace(conn, "failed to set TCP_MAXSEG on socket %i", s); MSS_SET(conn, mss); tcp_get_tap_ws(conn, opts, optlen); @@ -2159,8 +2154,8 @@ static int tcp_data_from_sock(struct ctx *c, struct tcp_tap_conn *conn) if (SEQ_LT(already_sent, 0)) { /* RFC 761, section 2.1. */ - trace("TCP: ACK sequence gap: ACK for %u, sent: %u", - conn->seq_ack_from_tap, conn->seq_to_tap); + flow_trace(conn, "ACK sequence gap: ACK for %u, sent: %u", + conn->seq_ack_from_tap, conn->seq_to_tap); conn->seq_to_tap = conn->seq_ack_from_tap; already_sent = 0; } @@ -2392,8 +2387,9 @@ static int tcp_data_from_tap(struct ctx *c, struct tcp_tap_conn *conn, tcp_tap_window_update(conn, max_ack_seq_wnd); if (retr) { - trace("TCP: fast re-transmit, ACK: %u, previous sequence: %u", - max_ack_seq, conn->seq_to_tap); + flow_trace(conn, + "fast re-transmit, ACK: %u, previous sequence: %u", + max_ack_seq, conn->seq_to_tap); conn->seq_ack_from_tap = max_ack_seq; conn->seq_to_tap = max_ack_seq; tcp_data_from_sock(c, conn); @@ -2542,7 +2538,7 @@ int tcp_tap_handler(struct ctx *c, uint8_t pif, int af, return 1; } - trace("TCP: packet length %lu from tap for index %u", len, FLOW_IDX(conn)); + flow_trace(conn, "packet length %lu from tap", len); if (th->rst) { conn_event(c, conn, CLOSED); @@ -2782,17 +2778,16 @@ void tcp_timer_handler(struct ctx *c, union epoll_ref ref) tcp_timer_ctl(c, conn); } else if (conn->flags & ACK_FROM_TAP_DUE) { if (!(conn->events & ESTABLISHED)) { - debug("TCP: index %u, handshake timeout", FLOW_IDX(conn)); + flow_dbg(conn, "handshake timeout"); tcp_rst(c, conn); } else if (CONN_HAS(conn, SOCK_FIN_SENT | TAP_FIN_ACKED)) { - debug("TCP: index %u, FIN timeout", FLOW_IDX(conn)); + flow_dbg(conn, "FIN timeout"); tcp_rst(c, conn); } else if (conn->retrans == TCP_MAX_RETRANS) { - debug("TCP: index %u, retransmissions count exceeded", - FLOW_IDX(conn)); + flow_dbg(conn, "retransmissions count exceeded"); tcp_rst(c, conn); } else { - debug("TCP: index %u, ACK timeout, retry", FLOW_IDX(conn)); + flow_dbg(conn, "ACK timeout, retry"); conn->retrans++; conn->seq_to_tap = conn->seq_ack_from_tap; tcp_data_from_sock(c, conn); @@ -2810,7 +2805,7 @@ void tcp_timer_handler(struct ctx *c, union epoll_ref ref) */ timerfd_settime(conn->timer, 0, &new, &old); if (old.it_value.tv_sec == ACT_TIMEOUT) { - debug("TCP: index %u, activity timeout", FLOW_IDX(conn)); + flow_dbg(conn, "activity timeout"); tcp_rst(c, conn); } } diff --git a/tcp_splice.c b/tcp_splice.c index 212fe16..28b91e1 100644 --- a/tcp_splice.c +++ b/tcp_splice.c @@ -139,8 +139,7 @@ static int tcp_splice_epoll_ctl(const struct ctx *c, if (epoll_ctl(c->epollfd, m, conn->s[0], &ev[0]) || epoll_ctl(c->epollfd, m, conn->s[1], &ev[1])) { int ret = -errno; - err("TCP (spliced): index %u, ERROR on epoll_ctl(): %s", - FLOW_IDX(conn), strerror(errno)); + flow_err(conn, "ERROR on epoll_ctl(): %s", strerror(errno)); return ret; } @@ -165,10 +164,9 @@ static void conn_flag_do(const struct ctx *c, struct tcp_splice_conn *conn, return; conn->flags &= flag; - if (flag_index >= 0) { - debug("TCP (spliced): index %u: %s dropped", FLOW_IDX(conn), - tcp_splice_flag_str[flag_index]); - } + if (flag_index >= 0) + flow_dbg(conn, "%s dropped", + tcp_splice_flag_str[flag_index]); } else { int flag_index = fls(flag); @@ -176,10 +174,8 @@ static void conn_flag_do(const struct ctx *c, struct tcp_splice_conn *conn, return; conn->flags |= flag; - if (flag_index >= 0) { - debug("TCP (spliced): index %u: %s", FLOW_IDX(conn), - tcp_splice_flag_str[flag_index]); - } + if (flag_index >= 0) + flow_dbg(conn, "%s", tcp_splice_flag_str[flag_index]); } if (flag == CLOSING) { @@ -190,8 +186,7 @@ static void conn_flag_do(const struct ctx *c, struct tcp_splice_conn *conn, #define conn_flag(c, conn, flag) \ do { \ - trace("TCP (spliced): flag at %s:%i", \ - __func__, __LINE__); \ + flow_trace(conn, "flag at %s:%i", __func__, __LINE__); \ conn_flag_do(c, conn, flag); \ } while (0) @@ -211,10 +206,8 @@ static void conn_event_do(const struct ctx *c, struct tcp_splice_conn *conn, return; conn->events &= event; - if (flag_index >= 0) { - debug("TCP (spliced): index %u, ~%s", FLOW_IDX(conn), - tcp_splice_event_str[flag_index]); - } + if (flag_index >= 0) + flow_dbg(conn, "~%s", tcp_splice_event_str[flag_index]); } else { int flag_index = fls(event); @@ -222,10 +215,8 @@ static void conn_event_do(const struct ctx *c, struct tcp_splice_conn *conn, return; conn->events |= event; - if (flag_index >= 0) { - debug("TCP (spliced): index %u, %s", FLOW_IDX(conn), - tcp_splice_event_str[flag_index]); - } + if (flag_index >= 0) + flow_dbg(conn, "%s", tcp_splice_event_str[flag_index]); } if (tcp_splice_epoll_ctl(c, conn)) @@ -234,8 +225,7 @@ static void conn_event_do(const struct ctx *c, struct tcp_splice_conn *conn, #define conn_event(c, conn, event) \ do { \ - trace("TCP (spliced): event at %s:%i", \ - __func__, __LINE__); \ + flow_trace(conn, "event at %s:%i",__func__, __LINE__); \ conn_event_do(c, conn, event); \ } while (0) @@ -281,7 +271,7 @@ void tcp_splice_destroy(struct ctx *c, union flow *flow) conn->events = SPLICE_CLOSED; conn->flags = 0; - debug("TCP (spliced): index %u, CLOSED", FLOW_IDX(conn)); + flow_dbg(conn, "CLOSED"); flow_table_compact(c, flow); } @@ -314,16 +304,17 @@ static int tcp_splice_connect_finish(const struct ctx *c, if (conn->pipe[side][0] < 0) { if (pipe2(conn->pipe[side], O_NONBLOCK | O_CLOEXEC)) { - err("TCP (spliced): cannot create %d->%d pipe: %s", - side, !side, strerror(errno)); + flow_err(conn, "cannot create %d->%d pipe: %s", + side, !side, strerror(errno)); conn_flag(c, conn, CLOSING); return -EIO; } if (fcntl(conn->pipe[side][0], F_SETPIPE_SZ, c->tcp.pipe_size)) { - trace("TCP (spliced): cannot set %d->%d pipe size to %lu", - side, !side, c->tcp.pipe_size); + flow_trace(conn, + "cannot set %d->%d pipe size to %lu", + side, !side, c->tcp.pipe_size); } } } @@ -363,8 +354,8 @@ static int tcp_splice_connect(const struct ctx *c, struct tcp_splice_conn *conn, if (setsockopt(conn->s[1], SOL_TCP, TCP_QUICKACK, &((int){ 1 }), sizeof(int))) { - trace("TCP (spliced): failed to set TCP_QUICKACK on socket %i", - conn->s[1]); + flow_trace(conn, "failed to set TCP_QUICKACK on socket %i", + conn->s[1]); } if (CONN_V6(conn)) { @@ -475,7 +466,7 @@ bool tcp_splice_conn_from_sock(const struct ctx *c, } if (setsockopt(s, SOL_TCP, TCP_QUICKACK, &((int){ 1 }), sizeof(int))) - trace("TCP (spliced): failed to set TCP_QUICKACK on %i", s); + flow_trace(conn, "failed to set TCP_QUICKACK on %i", s); conn->f.type = FLOW_TCP_SPLICE; conn->s[0] = s; @@ -555,7 +546,7 @@ retry: readlen = splice(conn->s[fromside], NULL, conn->pipe[fromside][1], NULL, c->tcp.pipe_size, SPLICE_F_MOVE | SPLICE_F_NONBLOCK); - trace("TCP (spliced): %li from read-side call", readlen); + flow_trace(conn, "%li from read-side call", readlen); if (readlen < 0) { if (errno == EINTR) goto retry; @@ -581,8 +572,8 @@ eintr: written = splice(conn->pipe[fromside][0], NULL, conn->s[!fromside], NULL, to_write, SPLICE_F_MOVE | more | SPLICE_F_NONBLOCK); - trace("TCP (spliced): %li from write-side call (passed %lu)", - written, to_write); + flow_trace(conn, "%li from write-side call (passed %lu)", + written, to_write); /* Most common case: skip updating counters. */ if (readlen > 0 && readlen == written) { @@ -794,8 +785,8 @@ void tcp_splice_timer(struct ctx *c, union flow *flow) if ((conn->flags & set) && !(conn->flags & act)) { if (setsockopt(conn->s[side], SOL_SOCKET, SO_RCVLOWAT, &((int){ 1 }), sizeof(int))) { - trace("TCP (spliced): can't set SO_RCVLOWAT on " - "%i", conn->s[side]); + flow_trace(conn, "can't set SO_RCVLOWAT on %d", + conn->s[side]); } conn_flag(c, conn, ~set); } -- 2.42.0
On Thu, 23 Nov 2023 13:36:29 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Most of the messages logged by the TCP code (be they errors, debug or trace messages) are related to a specific connection / flow. We're fairly consistent about prefixing these with the type of connection and the connection / flow index. However there are a few places where we put the index later in the message or omit it entirely. The template with the prefix is also a little bulky to carry around for every message, particularly for spliced connections. To help keep this consistent, introduce some helpers to log messages linked to a specific flow. It takes the flow as a parameter and adds a uniform prefix to each message. This makes things slightly neater now, but more importantly will help keep formatting consistent as we add more things to the flow table. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- flow.c | 21 ++++++++++++++ flow.h | 14 +++++++++ tcp.c | 81 ++++++++++++++++++++++++---------------------------- tcp_splice.c | 61 +++++++++++++++++---------------------- 4 files changed, 99 insertions(+), 78 deletions(-) diff --git a/flow.c b/flow.c index 0fff119..cb2cf62 100644 --- a/flow.c +++ b/flow.c @@ -64,3 +64,24 @@ void flow_table_compact(struct ctx *c, union flow *hole) memset(from, 0, sizeof(*from)); } + +/** flow_log_ - Log flow-related message + * @f: flow the message is related to + * @pri: Log priority + * @fmt: Format string + * @...: printf-arguments + * + * @fmt must include an initial "%s" to expand to the prefix we generate + * (typically added by the flow_log() macro).I don't understand how it does that -- flow_log() seems to just pass __VA_ARGS__ through?+ */ +void flow_log_(const struct flow_common *f, int pri, const char *fmt, ...) +{ + char msg[BUFSIZ]; + va_list args; + + va_start(args, fmt); + (void)vsnprintf(msg, sizeof(msg), fmt, args); + va_end(args); + + logmsg(pri, "Flow %u (%s): %s", flow_idx(f), FLOW_TYPE(f), msg); +} diff --git a/flow.h b/flow.h index 9f49195..b6da516 100644 --- a/flow.h +++ b/flow.h @@ -43,4 +43,18 @@ union flow; void flow_table_compact(struct ctx *c, union flow *hole); +void flow_log_(const struct flow_common *f, int pri, const char *fmt, ...) + __attribute__((format(printf, 3, 4))); + +#define flow_log(f_, pri, ...) flow_log_(&(f_)->f, (pri), __VA_ARGS__) + +#define flow_dbg(f, ...) flow_log((f), LOG_DEBUG, __VA_ARGS__) +#define flow_err(f, ...) flow_log((f), LOG_ERR, __VA_ARGS__) + +#define flow_trace(f, ...) \ + do { \ + if (log_trace) \ + flow_dbg((f), __VA_ARGS__); \ + } while (0) + #endif /* FLOW_H */-- Stefano
On Thu, Nov 23, 2023 at 07:58:45AM +0100, Stefano Brivio wrote:On Thu, 23 Nov 2023 13:36:29 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Oops, that's a leftover from an earlier version. I formatting the prefix first, then adding it to the message, rather than the message first and adding it to the prefix. I was attempting to avoid a fixed size buffer for the message, but it just turned out weird and troublesome. Plus vlogmsg() has a fixed sized buffer anyway, so why bother. Anyway, I've removed the stale comment. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibsonMost of the messages logged by the TCP code (be they errors, debug or trace messages) are related to a specific connection / flow. We're fairly consistent about prefixing these with the type of connection and the connection / flow index. However there are a few places where we put the index later in the message or omit it entirely. The template with the prefix is also a little bulky to carry around for every message, particularly for spliced connections. To help keep this consistent, introduce some helpers to log messages linked to a specific flow. It takes the flow as a parameter and adds a uniform prefix to each message. This makes things slightly neater now, but more importantly will help keep formatting consistent as we add more things to the flow table. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- flow.c | 21 ++++++++++++++ flow.h | 14 +++++++++ tcp.c | 81 ++++++++++++++++++++++++---------------------------- tcp_splice.c | 61 +++++++++++++++++---------------------- 4 files changed, 99 insertions(+), 78 deletions(-) diff --git a/flow.c b/flow.c index 0fff119..cb2cf62 100644 --- a/flow.c +++ b/flow.c @@ -64,3 +64,24 @@ void flow_table_compact(struct ctx *c, union flow *hole) memset(from, 0, sizeof(*from)); } + +/** flow_log_ - Log flow-related message + * @f: flow the message is related to + * @pri: Log priority + * @fmt: Format string + * @...: printf-arguments + * + * @fmt must include an initial "%s" to expand to the prefix we generate + * (typically added by the flow_log() macro).I don't understand how it does that -- flow_log() seems to just pass __VA_ARGS__ through?
On Thu, Nov 23, 2023 at 01:36:23PM +1100, David Gibson wrote:Here's my latest revision of some of the basics of the flow table. So far it's basically just a renaming of the existing TCP connection table. It's used for some new logging helpers, but otherwise doesn't really function any differently. However, this subset of the flow table work no longer bloats flow/connection entries over a single cache line. That removes the most prominent drawback of earlier revisions, meaning I think this series is ready for merge now. Doing so will mean the later series making more substantive changes to the flow behaviour are simpler.Since posting this, I've come up with some small revisions. So, please don't apply as is. I'll respin once you review incorporating both your feedback and the changes I already made.David Gibson (6): test: Make handling of shell prompts with escapes a little more reliable flow, tcp: Generalise connection types flow, tcp: Move TCP connection table to unified flow table flow, tcp: Consolidate flow pointer<->index helpers flow: Make unified version of flow table compaction flow, tcp: Add logging helpers for connection related messages Makefile | 14 +-- flow.c | 87 +++++++++++++++++++ flow.h | 60 +++++++++++++ flow_table.h | 45 ++++++++++ passt.h | 3 + tcp.c | 232 +++++++++++++++++++++++--------------------------- tcp.h | 5 -- tcp_conn.h | 46 +++------- tcp_splice.c | 88 +++++++++---------- test/lib/term | 6 +- 10 files changed, 361 insertions(+), 225 deletions(-) create mode 100644 flow.c create mode 100644 flow.h create mode 100644 flow_table.h-- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson