[PATCH v3 0/9] Take care of clang-tidy warnings with LLVM >= 16
So I started hitting some clang-tidy warnings with LLVM 16, some looked
bogus, so I upgraded to LLVM 19, and... I got even more. This series
takes care of them in different ways.

v3:
- split 5/8 into 5/9 and 6/9: in the first, drop O_APPEND so that we
  can have a helper to open any output file we need, and in the second
  one, always use O_CLOEXEC for pcap file (and use the new helper, now
  that we can)

v2:
- make snprintf_check() return and set errno on failure, in 2/8
- add missing err_perror() calls on clock_gettime() failures in 6/8
- drop all explicit integer assignments in enum udp_iov_idx in 7/8

Stefano Brivio (9):
  Makefile: Exclude qrap.c from clang-tidy checks
  treewide: Comply with CERT C rule ERR33-C for snprintf()
  treewide: Silence cert-err33-c clang-tidy warnings for fprintf()
  Makefile: Disable readability-math-missing-parentheses clang-tidy check
  log: Don't use O_APPEND at all
  treewide: Suppress clang-tidy warning if we already use O_CLOEXEC or if we can't
  treewide: Address cert-err33-c clang-tidy warnings for clock and timer functions
  udp: Take care of cert-int09-c clang-tidy warning for enum udp_iov_idx
  util: Don't use errno after a successful call in __daemon()

 Makefile | 13 ++++++++---
 arch.c   |  6 ++++-
 conf.c   | 62 +++++++++++++++++++++++++++----------------------
 log.c    | 15 ++++--------
 passt.c  |  9 ++++---
 pasta.c  | 11 ++++++---
 pcap.c   | 24 ++++++++++---------
 tap.c    |  5 ++--
 tcp.c    | 12 +++++++---
 udp.c    | 10 ++++----
 util.c   | 71 +++++++++++++++++++++++++++++++++++---------------------
 util.h   |  7 +++++-
 12 files changed, 148 insertions(+), 97 deletions(-)

--
2.43.0
We'll deprecate qrap(1) soon, and warnings reported by clang-tidy as
of LLVM versions 16 and later would need a bunch of changes there to
be addressed, mostly around CERT C rule ERR33-C and checking return
code from snprintf().
It makes no sense to fix warnings in qrap just for the sake of it, so
officially declare the bitrotting season open.
Signed-off-by: Stefano Brivio
clang-tidy, starting from LLVM version 16, up to at least LLVM version
19, now checks that we detect and handle errors for snprintf() as
requested by CERT C rule ERR33-C. These warnings were logged with LLVM
version 19.1.2 (at least Debian and Fedora match):
/home/sbrivio/passt/arch.c:43:3: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
43 | snprintf(new_path, PATH_MAX + sizeof(".avx2"), "%s.avx2", exe);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/arch.c:43:3: note: cast the expression to void to silence this warning
/home/sbrivio/passt/conf.c:577:4: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
577 | snprintf(netns, PATH_MAX, "/proc/%ld/ns/net", pidval);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/conf.c:577:4: note: cast the expression to void to silence this warning
/home/sbrivio/passt/conf.c:579:5: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
579 | snprintf(userns, PATH_MAX, "/proc/%ld/ns/user",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
580 | pidval);
| ~~~~~~~
/home/sbrivio/passt/conf.c:579:5: note: cast the expression to void to silence this warning
/home/sbrivio/passt/pasta.c:105:2: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
105 | snprintf(ns, PATH_MAX, "/proc/%i/ns/net", pasta_child_pid);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/pasta.c:105:2: note: cast the expression to void to silence this warning
/home/sbrivio/passt/pasta.c:242:2: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
242 | snprintf(uidmap, BUFSIZ, "0 %u 1", uid);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/pasta.c:242:2: note: cast the expression to void to silence this warning
/home/sbrivio/passt/pasta.c:243:2: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
243 | snprintf(gidmap, BUFSIZ, "0 %u 1", gid);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/pasta.c:243:2: note: cast the expression to void to silence this warning
/home/sbrivio/passt/tap.c:1155:4: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
1155 | snprintf(path, UNIX_PATH_MAX - 1, UNIX_SOCK_PATH, i);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/tap.c:1155:4: note: cast the expression to void to silence this warning
Don't silence the warnings as they might actually have some merit. Add
an snprintf_check() function, instead, checking that we're not
truncating messages while printing to buffers, and terminate if the
check fails.
Signed-off-by: Stefano Brivio
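For illustration only, a truncation-checking wrapper along these lines could look like the sketch below. It is not taken from the patch: the errno values are assumptions, and it follows the v2 changelog in returning an error (with errno set) and leaving the decision to terminate to the caller.

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Sketch only: format into str, and fail instead of silently truncating.
 * Returns 0 on success, -1 with errno set otherwise. The errno values
 * chosen here (EIO, ENOBUFS) are assumptions made for this example.
 */
int snprintf_check(char *str, size_t size, const char *format, ...)
{
	va_list ap;
	int rc;

	va_start(ap, format);
	rc = vsnprintf(str, size, format, ap);
	va_end(ap);

	if (rc < 0) {			/* output or encoding error */
		errno = EIO;
		return -1;
	}

	if ((size_t)rc >= size) {	/* output didn't fit: truncated */
		errno = ENOBUFS;
		return -1;
	}

	return 0;
}
```

A caller that wants to terminate on failure would then check the return value and call something like die_perror(), instead of using the possibly truncated buffer.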
We use fprintf() to print to standard output or standard error
streams. If something gets truncated or there's an output error, we
don't really want to try and report that, and at the same time it's
not abnormal behaviour upon which we should terminate, either.
Just silence the warning with an ugly FPRINTF() variadic macro casting
the fprintf() expressions to void.
Signed-off-by: Stefano Brivio
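As a rough sketch of what such a macro might look like (the exact definition in the patch may differ, and print_usage() is a made-up caller for this example):

```c
#include <stdio.h>

/* Sketch: discard fprintf()'s return value on purpose. Casting the call
 * to void is what clang-tidy's own note suggests to silence cert-err33-c.
 */
#define FPRINTF(f, ...)	(void)fprintf(f, __VA_ARGS__)

/* Hypothetical caller: print a usage line, ignoring output errors */
void print_usage(FILE *f, const char *name)
{
	FPRINTF(f, "Usage: %s [OPTION]...\n", name);
}
```

Keeping the cast in one macro means the call sites stay readable and the decision to ignore the result is documented in a single place.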
With clang-tidy and LLVM 19:
/home/sbrivio/passt/conf.c:1218:29: error: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses,-warnings-as-errors]
1218 | const char *octet = str + 3 * i;
| ^~~~~~
| ( )
/home/sbrivio/passt/ndp.c:285:18: error: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses,-warnings-as-errors]
285 | .len = 1 + 2 * n,
| ^~~~~~
| ( )
/home/sbrivio/passt/ndp.c:329:23: error: '%' has higher precedence than '-'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses,-warnings-as-errors]
329 | memset(ptr, 0, 8 - dns_s_len % 8); /* padding */
| ^~~~~~~~~~~~~~
| ( )
/home/sbrivio/passt/pcap.c:131:20: error: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses,-warnings-as-errors]
131 | pcap_frame(iov + i * frame_parts, frame_parts, offset, &now);
| ^~~~~~~~~~~~~~~~
| ( )
/home/sbrivio/passt/util.c:216:10: error: '/' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses,-warnings-as-errors]
216 | return (a->tv_nsec + 1000000000 - b->tv_nsec) / 1000 +
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| ( )
/home/sbrivio/passt/util.c:217:10: error: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses,-warnings-as-errors]
217 | (a->tv_sec - b->tv_sec - 1) * 1000000;
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| ( )
/home/sbrivio/passt/util.c:220:9: error: '/' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses,-warnings-as-errors]
220 | return (a->tv_nsec - b->tv_nsec) / 1000 +
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| ( )
/home/sbrivio/passt/util.c:221:9: error: '*' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses,-warnings-as-errors]
221 | (a->tv_sec - b->tv_sec) * 1000000;
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| ( )
/home/sbrivio/passt/util.c:545:32: error: '/' has higher precedence than '+'; add parentheses to explicitly specify the order of operations [readability-math-missing-parentheses,-warnings-as-errors]
545 | return clone(fn, stack_area + stack_size / 2, flags, arg);
| ^~~~~~~~~~~~~~~
| ( )
Just... no.
Signed-off-by: Stefano Brivio
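To make the point concrete: '*', '/' and '%' bind tighter than '+' and '-' in C, so the flagged expressions already evaluate in the intended order. The sketch below wraps the two return expressions quoted in the util.c warnings in a function; the function name and the branch condition are assumptions for this example, only the expressions come from the warnings above.

```c
#include <time.h>

/* Sketch: difference a - b in microseconds, assuming a is not earlier
 * than b. Name and branch condition are assumptions; the two return
 * expressions are the ones clang-tidy complains about.
 */
long long timespec_diff_us_sketch(const struct timespec *a,
				  const struct timespec *b)
{
	if (a->tv_nsec < b->tv_nsec) {
		/* Borrow one second, expressed in nanoseconds */
		return (a->tv_nsec + 1000000000 - b->tv_nsec) / 1000 +
		       (a->tv_sec - b->tv_sec - 1) * 1000000;
	}

	return (a->tv_nsec - b->tv_nsec) / 1000 +
	       (a->tv_sec - b->tv_sec) * 1000000;
}
```

Adding the parentheses the check asks for would not change the result of any of these expressions.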
We open the log file with O_APPEND, but switch it off before seeking,
and turn it back on afterwards.
We never seek when O_APPEND is on, so we don't actually need it, as
its only function is to override the offset for writes so that they
are always performed at the end regardless of the current offset
(which is at the end anyway, for us).
Signed-off-by: Stefano Brivio
On Mon, Oct 28, 2024 at 11:00:40AM +0100, Stefano Brivio wrote:
> We open the log file with O_APPEND, but switch it off before seeking, and turn it back on afterwards.
>
> We never seek when O_APPEND is on, so we don't actually need it, as its only function is to override the offset for writes so that they are always performed at the end regardless of the current offset (which is at the end anyway, for us).

Sorry, this sounded fishy to me on the call, but I figured I was just missing something. But looking at this the reasoning doesn't make sense to me.

We don't seek with O_APPEND, but we do write(), which is exactly where it matters. AIUI the point of O_APPEND is that if you have multiple processes writing to the same file, they won't clobber each others writes because of a stale file pointer.

That's usually not _necessary_ for us as such, but it's perhaps valuable since it reduces the likelihood of data loss if somehow you do get two instances logging to the same file.

Of course the rotation process *can* clobber things (which is exactly why I was always a bit sceptical of this "in place" rotation, not that we really have other options). But that at least is occasional, unlike each log write.

Maybe it's not worth having O_APPEND, but I don't think the reasoning above makes any real sense.
> Signed-off-by: Stefano Brivio
> ---
>  log.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/log.c b/log.c
> index 6932885..dd25862 100644
> --- a/log.c
> +++ b/log.c
> @@ -204,9 +204,6 @@ out:
>   */
>  static int logfile_rotate(int fd, const struct timespec *now)
>  {
> -	if (fcntl(fd, F_SETFL, O_RDWR /* Drop O_APPEND: explicit lseek() */))
> -		return -errno;
> -
>  #ifdef FALLOC_FL_COLLAPSE_RANGE
>  	/* Only for Linux >= 3.15, extent-based ext4 or XFS, glibc >= 2.18 */
>  	if (!fallocate(fd, FALLOC_FL_COLLAPSE_RANGE, 0, log_cut_size))
> @@ -215,9 +212,6 @@ static int logfile_rotate(int fd, const struct timespec *now)
>  #endif
>  	logfile_rotate_move(fd, now);
>
> -	if (fcntl(fd, F_SETFL, O_RDWR | O_APPEND))
> -		return -errno;
> -
>  	return 0;
>  }
>
> @@ -416,7 +410,7 @@ void logfile_init(const char *name, const char *path, size_t size)
>  	if (readlink("/proc/self/exe", exe, PATH_MAX - 1) < 0)
>  		die_perror("Failed to read own /proc/self/exe link");
>
> -	log_file = open(path, O_CREAT | O_TRUNC | O_APPEND | O_RDWR | O_CLOEXEC,
> +	log_file = open(path, O_CREAT | O_TRUNC | O_RDWR | O_CLOEXEC,
>  			S_IRUSR | S_IWUSR);
>  	if (log_file == -1)
>  		die_perror("Couldn't open log file %s", path);
--
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
On Tue, 29 Oct 2024 15:20:56 +1100
David Gibson wrote:

> On Mon, Oct 28, 2024 at 11:00:40AM +0100, Stefano Brivio wrote:
> > We open the log file with O_APPEND, but switch it off before seeking, and turn it back on afterwards.
> >
> > We never seek when O_APPEND is on, so we don't actually need it, as its only function is to override the offset for writes so that they are always performed at the end regardless of the current offset (which is at the end anyway, for us).
>
> Sorry, this sounded fishy to me on the call, but I figured I was just missing something. But looking at this the reasoning doesn't make sense to me.
>
> We don't seek with O_APPEND, but we do write(), which is exactly where it matters. AIUI the point of O_APPEND is that if you have multiple processes writing to the same file, they won't clobber each others writes because of a stale file pointer.
That's not the reason why I originally added it though: it was there
because I thought I would lseek() to do the rotation and possibly end
up with the cursor somewhere before the end. Then restart writing, and
the write would happen in the middle of the file:
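$ cat append.c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	int flags = O_CREAT | O_TRUNC | O_WRONLY |
		    ((argc == 3) ? O_APPEND : 0);
	int fd = open(argv[1], flags, S_IRUSR | S_IWUSR);
	char buf[BUFSIZ];

	memset(buf, 'a', BUFSIZ);
	write(fd, buf, 10);

	lseek(fd, 1, SEEK_SET);

	memset(buf, 'b', BUFSIZ);
	write(fd, buf, 10);
	write(fd, (char *){ "\n" }, 1);

	return 0;
}
$ gcc -o append{,.c}
$ ./append test append
$ cat test
aaaaaaaaaabbbbbbbbbb
$ ./append test
$ cat test
abbbbbbbbbb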
> That's usually not _necessary_ for us as such, but it's perhaps valuable since it reduces the likelihood of data loss if somehow you do get two instances logging to the same file.

The result will be completely unreadable anyway, so I don't think it matters for us.

> Of course the rotation process *can* clobber things (which is exactly why I was always a bit sceptical of this "in place" rotation, not that we really have other options).

Why would it clobber things? logfile_rotate_fallocate() and logfile_rotate_move() take care of cutting cleanly at a line boundary, and tests check that.

--
Stefano
On Tue, Oct 29, 2024 at 09:48:50AM +0100, Stefano Brivio wrote:
> That's not the reason why I originally added it though: it was there because I thought I would lseek() to do the rotation and possibly end up with the cursor somewhere before the end. Then restart writing, and the write would happen in the middle of the file:
I don't entirely follow. I see why you disable O_APPEND across the rotation, but I'm not clear on why it's opened with O_APPEND in the first place, if it's not for the typical logging reason.
> $ cat append.c
> #include <fcntl.h>
> #include <stdio.h>
> #include <string.h>
> #include <unistd.h>
>
> int main(int argc, char **argv)
> {
> 	int flags = O_CREAT | O_TRUNC | O_WRONLY |
> 		    ((argc == 3) ? O_APPEND : 0);
> 	int fd = open(argv[1], flags, S_IRUSR | S_IWUSR);
> 	char buf[BUFSIZ];
>
> 	memset(buf, 'a', BUFSIZ);
> 	write(fd, buf, 10);
>
> 	lseek(fd, 1, SEEK_SET);
>
> 	memset(buf, 'b', BUFSIZ);
> 	write(fd, buf, 10);
> 	write(fd, (char *){ "\n" }, 1);
>
> 	return 0;
> }
> $ gcc -o append{,.c}
> $ ./append test append
> $ cat test
> aaaaaaaaaabbbbbbbbbb
> $ ./append test
> $ cat test
> abbbbbbbbbb
> > That's usually not _necessary_ for us as such, but it's perhaps valuable since it reduces the likelihood of data loss if somehow you do get two instances logging to the same file.
>
> The result will be completely unreadable anyway, so I don't think it matters for us.

Not necessarily. It certainly can get garbled, but individual writes of reasonable size - such as a single log line will generally complete atomically. With a text logging format, that's not ideal but often pretty decipherable. Particularly if each writer includes a prefix identifying itself.

> > Of course the rotation process *can* clobber things (which is exactly why I was always a bit sceptical of this "in place" rotation, not that we really have other options).
>
> Why would it clobber things? logfile_rotate_fallocate() and logfile_rotate_move() take care of cutting cleanly at a line boundary, and tests check that.

I mean that in the case that there are multiple writers, the rotation breaks that "no data loss, and probably readable-ish" property of O_APPEND.

--
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
On Tue, 29 Oct 2024 20:32:40 +1100
David Gibson wrote:
> I don't entirely follow. I see why you disable O_APPEND across the rotation, but I'm not clear on why it's opened with O_APPEND in the first place, if it's not for the typical logging reason.

I initially opened it with O_APPEND because I _thought_ I would set the offset to a possibly inconsistent value around the rotation.

Then I dropped O_APPEND around the rotation, forgetting about the initial reason why I added it at all. So it makes no sense to have O_APPEND at all.
> > > That's usually not _necessary_ for us as such, but it's perhaps valuable since it reduces the likelihood of data loss if somehow you do get two instances logging to the same file.
> >
> > The result will be completely unreadable anyway, so I don't think it matters for us.
>
> Not necessarily. It certainly can get garbled, but individual writes of reasonable size - such as a single log line will generally complete atomically. With a text logging format, that's not ideal but often pretty decipherable. Particularly if each writer includes a prefix identifying itself.
>
> > > Of course the rotation process *can* clobber things (which is exactly why I was always a bit sceptical of this "in place" rotation, not that we really have other options).
> >
> > Why would it clobber things? logfile_rotate_fallocate() and logfile_rotate_move() take care of cutting cleanly at a line boundary, and tests check that.
>
> I mean that in the case that there are multiple writers, the rotation breaks that "no data loss, and probably readable-ish" property of O_APPEND.

Ah, sure. But I think that supporting multiple writers would need more work anyway (at least adding a prefix as you mentioned).

Well, anyway, if you think this might add a regression with multiple writers, I can add an extra flag to output_file_open() and keep O_APPEND for the log file. But I really struggle to see the actual use case.

--
Stefano
On Tue, Oct 29, 2024 at 11:23:29AM +0100, Stefano Brivio wrote:
> I initially opened it with O_APPEND because I _thought_ I would set the offset to a possibly inconsistent value around the rotation.
>
> Then I dropped O_APPEND around the rotation, forgetting about the initial reason why I added it at all. So it makes no sense to have O_APPEND at all.

Ok, that makes sense.

Except that maybe there is a reason to use O_APPEND (the multiple writer thing), even if it's not the one you thought of initially.

[snip]
> > > > Of course the rotation process *can* clobber things (which is exactly why I was always a bit sceptical of this "in place" rotation, not that we really have other options).
> > >
> > > Why would it clobber things? logfile_rotate_fallocate() and logfile_rotate_move() take care of cutting cleanly at a line boundary, and tests check that.
> >
> > I mean that in the case that there are multiple writers, the rotation breaks that "no data loss, and probably readable-ish" property of O_APPEND.
>
> Ah, sure. But I think that supporting multiple writers would need more work anyway (at least adding a prefix as you mentioned).

That's fair. I wonder if it might make sense to flock() the logfile, to (somewhat) enforce that only one process uses it at a time.

> Well, anyway, if you think this might add a regression with multiple writers, I can add an extra flag to output_file_open() and keep O_APPEND for the log file. But I really struggle to see the actual use case.

--
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
On Wed, 30 Oct 2024 13:33:43 +1100
David Gibson wrote:
> Ok, that makes sense.
>
> Except that maybe there is a reason to use O_APPEND (the multiple writer thing), even if it's not the one you thought of initially.
>
> [snip]
>
> > > > > Of course the rotation process *can* clobber things (which is exactly why I was always a bit sceptical of this "in place" rotation, not that we really have other options).
> > > >
> > > > Why would it clobber things? logfile_rotate_fallocate() and logfile_rotate_move() take care of cutting cleanly at a line boundary, and tests check that.
> > >
> > > I mean that in the case that there are multiple writers, the rotation breaks that "no data loss, and probably readable-ish" property of O_APPEND.
> >
> > Ah, sure. But I think that supporting multiple writers would need more work anyway (at least adding a prefix as you mentioned).
>
> That's fair. I wonder if it might make sense to flock() the logfile, to (somewhat) enforce that only one process uses it at a time.

...but if it kind of works for multiple writers, we shouldn't prevent that usage, right?

On the other hand, I don't think we should try to make that usage all nice and supported because we would need a prefix, which, in case of a single writer, just adds noise and size. And I don't think we want to detect if there are multiple writers...

So, all in all, I would choose to spend no effort and leave like it is, until somebody comes up with a use case in one direction or the other.

--
Stefano
On Wed, Oct 30, 2024 at 01:27:26PM +0100, Stefano Brivio wrote:
> ...but if it kind of works for multiple writers, we shouldn't prevent that usage, right?
>
> On the other hand, I don't think we should try to make that usage all nice and supported because we would need a prefix, which, in case of a single writer, just adds noise and size. And I don't think we want to detect if there are multiple writers...
>
> So, all in all, I would choose to spend no effort and leave like it is, until somebody comes up with a use case in one direction or the other.

Yeah, that's fair.

--
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson
In pcap_init(), we should always open the packet capture file with
O_CLOEXEC, even if we're not running in foreground: O_CLOEXEC means
close-on-exec, not close-on-fork.
In logfile_init() and pidfile_open(), the fact that we pass a third
'mode' argument to open() seems to confuse the android-cloexec-open
checker in LLVM versions from 16 to 19 (at least).
The checker is suggesting to add O_CLOEXEC to 'mode', and not in
'flags', where we already have it.
Add a suppression for clang-tidy and a comment, and avoid repeating
those three times by adding a new helper, output_file_open().
Signed-off-by: Stefano Brivio
On Mon, Oct 28, 2024 at 11:00:41AM +0100, Stefano Brivio wrote:
> In pcap_init(), we should always open the packet capture file with O_CLOEXEC, even if we're not running in foreground: O_CLOEXEC means close-on-exec, not close-on-fork.
>
> In logfile_init() and pidfile_open(), the fact that we pass a third 'mode' argument to open() seems to confuse the android-cloexec-open checker in LLVM versions from 16 to 19 (at least).
>
> The checker is suggesting to add O_CLOEXEC to 'mode', and not in 'flags', where we already have it.

.. well.. the checker with the googletest package installed, anyway :/

> Add a suppression for clang-tidy and a comment, and avoid repeating those three times by adding a new helper, output_file_open().
>
> Signed-off-by: Stefano Brivio

Reviewed-by: David Gibson
> ---
>  conf.c |  3 ++-
>  log.c  |  3 +--
>  pcap.c |  7 ++-----
>  util.c | 26 ++++++++++----------------
>  util.h |  2 +-
>  5 files changed, 16 insertions(+), 25 deletions(-)
>
> diff --git a/conf.c b/conf.c
> index 4db7c64..b28f411 100644
> --- a/conf.c
> +++ b/conf.c
> @@ -1194,7 +1194,8 @@ static void conf_open_files(struct ctx *c)
>  	if (c->mode != MODE_PASTA && c->fd_tap == -1)
>  		c->fd_tap_listen = tap_sock_unix_open(c->sock_path);
>
> -	c->pidfile_fd = pidfile_open(c->pidfile);
> +	if (*c->pidfile && (c->pidfile_fd = output_file_open(c->pidfile)) < 0)
> +		die_perror("Couldn't open PID file %s", c->pidfile);
>  }
>
>  /**
> diff --git a/log.c b/log.c
> index dd25862..48db4d9 100644
> --- a/log.c
> +++ b/log.c
> @@ -410,8 +410,7 @@ void logfile_init(const char *name, const char *path, size_t size)
>  	if (readlink("/proc/self/exe", exe, PATH_MAX - 1) < 0)
>  		die_perror("Failed to read own /proc/self/exe link");
>
> -	log_file = open(path, O_CREAT | O_TRUNC | O_RDWR | O_CLOEXEC,
> -			S_IRUSR | S_IWUSR);
> +	log_file = output_file_open(path);
>  	if (log_file == -1)
>  		die_perror("Couldn't open log file %s", path);
>
> diff --git a/pcap.c b/pcap.c
> index 6ee6cdf..a07eb33 100644
> --- a/pcap.c
> +++ b/pcap.c
> @@ -158,18 +158,15 @@ void pcap_iov(const struct iovec *iov, size_t iovcnt, size_t offset)
>   */
>  void pcap_init(struct ctx *c)
>  {
> -	int flags = O_WRONLY | O_CREAT | O_TRUNC;
> -
>  	if (pcap_fd != -1)
>  		return;
>
>  	if (!*c->pcap)
>  		return;
>
> -	flags |= c->foreground ? O_CLOEXEC : 0;
> -	pcap_fd = open(c->pcap, flags, S_IRUSR | S_IWUSR);
> +	pcap_fd = output_file_open(c->pcap);
>  	if (pcap_fd == -1) {
> -		perror("open");
> +		err_perror("Couldn't open pcap file %s", c->pcap);
>  		return;
>  	}
>
> diff --git a/util.c b/util.c
> index 9cb705e..d838b34 100644
> --- a/util.c
> +++ b/util.c
> @@ -407,25 +407,19 @@ void pidfile_write(int fd, pid_t pid)
>  }
>
>  /**
> - * pidfile_open() - Open PID file if needed
> - * @path:	Path for PID file, empty string if no PID file is requested
> + * output_file_open() - Open file for output, if needed
> + * @path:	Path for output file
>   *
> - * Return: descriptor for PID file, -1 if path is NULL, won't return on failure
> + * Return: file descriptor on success, -1 on failure with errno set by open()
>   */
> -int pidfile_open(const char *path)
> +int output_file_open(const char *path)
>  {
> -	int fd;
> -
> -	if (!*path)
> -		return -1;
> -
> -	if ((fd = open(path, O_CREAT | O_TRUNC | O_WRONLY | O_CLOEXEC,
> -			     S_IRUSR | S_IWUSR)) < 0) {
> -		perror("PID file open");
> -		exit(EXIT_FAILURE);
> -	}
> -
> -	return fd;
> +	/* We use O_CLOEXEC here, but clang-tidy as of LLVM 16 to 19 looks for
> +	 * it in the 'mode' argument if we have one
> +	 */
> +	return open(path, O_CREAT | O_TRUNC | O_WRONLY | O_CLOEXEC,
> +		    /* NOLINTNEXTLINE(android-cloexec-open) */
> +		    S_IRUSR | S_IWUSR);
>  }
>
>  /**
> diff --git a/util.h b/util.h
> index 4f8b768..73b4a49 100644
> --- a/util.h
> +++ b/util.h
> @@ -193,7 +193,7 @@ char *line_read(char *buf, size_t len, int fd);
>  void ns_enter(const struct ctx *c);
>  bool ns_is_init(void);
>  int open_in_ns(const struct ctx *c, const char *path, int flags);
> -int pidfile_open(const char *path);
> +int output_file_open(const char *path);
>  void pidfile_write(int fd, pid_t pid);
>  int __daemon(int pidfile_fd, int devnull_fd);
>  int fls(unsigned long x);
-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
For clock_gettime(), we shouldn't ignore errors if they happen during
the initialisation phase, because something is seriously wrong and it's
not helpful if we proceed as if nothing happened.
Once we're up and running, though, it's probably better to report the
error and use a stale value than to terminate altogether. Make sure
we use a zero value if we don't have a stale one somewhere.
For timerfd_gettime() and timerfd_settime() failures, just report an
error: there isn't much else we can do.
Signed-off-by: Stefano Brivio
On Mon, Oct 28, 2024 at 11:00:42AM +0100, Stefano Brivio wrote:
Reviewed-by: David Gibson
---
 passt.c |  9 ++++++---
 pcap.c  | 17 +++++++++++------
 tcp.c   | 12 +++++++++---
 3 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/passt.c b/passt.c
index ad6f0bc..eaf231d 100644
--- a/passt.c
+++ b/passt.c
@@ -207,7 +207,8 @@ int main(int argc, char **argv)
 	struct timespec now;
 	struct sigaction sa;
 
-	clock_gettime(CLOCK_MONOTONIC, &log_start);
+	if (clock_gettime(CLOCK_MONOTONIC, &log_start))
+		die_perror("Failed to get CLOCK_MONOTONIC time");
 
 	arch_avx2_exec(argv);
 
@@ -265,7 +266,8 @@ int main(int argc, char **argv)
 
 	secret_init(&c);
 
-	clock_gettime(CLOCK_MONOTONIC, &now);
+	if (clock_gettime(CLOCK_MONOTONIC, &now))
+		die_perror("Failed to get CLOCK_MONOTONIC time");
 
 	flow_init();
 
@@ -313,7 +315,8 @@ loop:
 	if (nfds == -1 && errno != EINTR)
 		die_perror("epoll_wait() failed in main loop");
 
-	clock_gettime(CLOCK_MONOTONIC, &now);
+	if (clock_gettime(CLOCK_MONOTONIC, &now))
+		err_perror("Failed to get CLOCK_MONOTONIC time");
 
 	for (i = 0; i < nfds; i++) {
 		union epoll_ref ref = *((union epoll_ref *)&events[i].data.u64);
diff --git a/pcap.c b/pcap.c
index a07eb33..7751ddc 100644
--- a/pcap.c
+++ b/pcap.c
@@ -100,12 +100,14 @@ static void pcap_frame(const struct iovec *iov, size_t iovcnt,
 void pcap(const char *pkt, size_t l2len)
 {
 	struct iovec iov = { (char *)pkt, l2len };
-	struct timespec now;
+	struct timespec now = { 0 };
 
 	if (pcap_fd == -1)
 		return;
 
-	clock_gettime(CLOCK_REALTIME, &now);
+	if (clock_gettime(CLOCK_REALTIME, &now))
+		err_perror("Failed to get CLOCK_REALTIME time");
+
 	pcap_frame(&iov, 1, 0, &now);
 }
 
@@ -119,13 +121,14 @@ void pcap(const char *pkt, size_t l2len)
 void pcap_multiple(const struct iovec *iov, size_t frame_parts, unsigned int n,
 		   size_t offset)
 {
-	struct timespec now;
+	struct timespec now = { 0 };
 	unsigned int i;
 
 	if (pcap_fd == -1)
 		return;
 
-	clock_gettime(CLOCK_REALTIME, &now);
+	if (clock_gettime(CLOCK_REALTIME, &now))
+		err_perror("Failed to get CLOCK_REALTIME time");
 
 	for (i = 0; i < n; i++)
 		pcap_frame(iov + i * frame_parts, frame_parts, offset, &now);
@@ -143,12 +146,14 @@ void pcap_multiple(const struct iovec *iov, size_t frame_parts, unsigned int n,
 /* cppcheck-suppress unusedFunction */
 void pcap_iov(const struct iovec *iov, size_t iovcnt, size_t offset)
 {
-	struct timespec now;
+	struct timespec now = { 0 };
 
 	if (pcap_fd == -1)
 		return;
 
-	clock_gettime(CLOCK_REALTIME, &now);
+	if (clock_gettime(CLOCK_REALTIME, &now))
+		err_perror("Failed to get CLOCK_REALTIME time");
+
 	pcap_frame(iov, iovcnt, offset, &now);
 }
 
diff --git a/tcp.c b/tcp.c
index 0569dc6..f03243d 100644
--- a/tcp.c
+++ b/tcp.c
@@ -549,7 +549,8 @@ static void tcp_timer_ctl(const struct ctx *c, struct tcp_tap_conn *conn)
 		 (unsigned long long)it.it_value.tv_sec,
 		 (unsigned long long)it.it_value.tv_nsec / 1000 / 1000);
 
-	timerfd_settime(conn->timer, 0, &it, NULL);
+	if (timerfd_settime(conn->timer, 0, &it, NULL))
+		flow_err(conn, "failed to set timer: %s", strerror(errno));
 }
 
 /**
@@ -2235,7 +2236,9 @@ void tcp_timer_handler(const struct ctx *c, union epoll_ref ref)
 	 * timer is currently armed, this event came from a previous setting,
 	 * and we just set the timer to a new point in the future: discard it.
 	 */
-	timerfd_gettime(conn->timer, &check_armed);
+	if (timerfd_gettime(conn->timer, &check_armed))
+		flow_err(conn, "failed to read timer: %s", strerror(errno));
+
 	if (check_armed.it_value.tv_sec || check_armed.it_value.tv_nsec)
 		return;
 
@@ -2273,7 +2276,10 @@ void tcp_timer_handler(const struct ctx *c, union epoll_ref ref)
 		 * case. This avoids having to preemptively reset the timer on
 		 * ~ACK_TO_TAP_DUE or ~ACK_FROM_TAP_DUE.
 		 */
-		timerfd_settime(conn->timer, 0, &new, &old);
+		if (timerfd_settime(conn->timer, 0, &new, &old))
+			flow_err(conn, "failed to set timer: %s",
+				 strerror(errno));
+
 		if (old.it_value.tv_sec == ACT_TIMEOUT) {
 			flow_dbg(conn, "activity timeout");
 			tcp_rst(c, conn);
/home/sbrivio/passt/udp.c:171:1: error: inital values in enum 'udp_iov_idx' are not consistent, consider explicit initialization of all, none or only the first enumerator [cert-int09-c,readability-enum-initial-value,-warnings-as-errors]
171 | enum udp_iov_idx {
| ^
172 | UDP_IOV_TAP = 0,
173 | UDP_IOV_ETH = 1,
174 | UDP_IOV_IP = 2,
175 | UDP_IOV_PAYLOAD = 3,
176 | UDP_NUM_IOVS
|
| = 4
Don't initialise any value, so that it's obvious that constants map to
unique values.
Signed-off-by: Stefano Brivio
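For reference, the enum with all explicit initialisers dropped would presumably look like the sketch below. The enumerator names are taken from the warning above, while the two static assertions are additions for this example only. Without initialisers, C assigns 0, 1, 2, ... in declaration order, so the last enumerator still counts the entries.

```c
/* Sketch of the enum without explicit initialisers: values are assigned in
 * declaration order starting from zero, so they are unique by construction.
 */
enum udp_iov_idx {
	UDP_IOV_TAP,
	UDP_IOV_ETH,
	UDP_IOV_IP,
	UDP_IOV_PAYLOAD,
	UDP_NUM_IOVS
};

/* Compile-time checks (C11), added for illustration only */
_Static_assert(UDP_IOV_TAP == 0, "first enumerator should be zero");
_Static_assert(UDP_NUM_IOVS == 4, "UDP_NUM_IOVS should count the entries");
```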
I thought we could just set errno to 0, do a bunch of stuff, and check
that errno didn't change to infer we succeeded. But clang-tidy,
starting with LLVM 19, reports:
/home/sbrivio/passt/util.c:465:6: error: An undefined value may be read from 'errno' [clang-analyzer-unix.Errno,-warnings-as-errors]
465 | if (errno)
| ^
/usr/include/errno.h:38:16: note: expanded from macro 'errno'
38 | # define errno (*__errno_location ())
| ^~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/util.c:446:6: note: Assuming the condition is false
446 | if (pid == -1) {
| ^~~~~~~~~
/home/sbrivio/passt/util.c:446:2: note: Taking false branch
446 | if (pid == -1) {
| ^
/home/sbrivio/passt/util.c:451:6: note: Assuming 'pid' is 0
451 | if (pid) {
| ^~~
/home/sbrivio/passt/util.c:451:2: note: Taking false branch
451 | if (pid) {
| ^
/home/sbrivio/passt/util.c:463:2: note: Assuming that 'close' is successful; 'errno' becomes undefined after the call
463 | close(devnull_fd);
| ^~~~~~~~~~~~~~~~~
/home/sbrivio/passt/util.c:465:6: note: An undefined value may be read from 'errno'
465 | if (errno)
| ^
/usr/include/errno.h:38:16: note: expanded from macro 'errno'
38 | # define errno (*__errno_location ())
| ^~~~~~~~~~~~~~~~~~~~~~
The LLVM documentation for the unix.Errno checker (section 1.1.8.3,
unix.Errno (C)), at:
https://clang.llvm.org/docs/analyzer/checkers.html#unix-errno
states that:
The C and POSIX standards often do not define if a standard library
function may change value of errno if the call does not fail.
Therefore, errno should only be used if it is known from the return
value of a function that the call has failed.
which is, somewhat surprisingly, the case for close().
Instead of using errno, check the actual return values of the calls
we issue here.
Signed-off-by: Stefano Brivio