Re: [PATCH 3/4] util, lineread, tap: Overflow checks on long signed sums and subtractions

28 Jun 2024

On Fri, 28 Jun 2024 09:55:18 +0200
Stefano Brivio  wrote:
...
On Fri, 28 Jun 2024 17:11:43 +1000
David Gibson  wrote:
...
On Thu, Jun 27, 2024 at 09:55:46AM +0200, Stefano Brivio wrote:
...
On Thu, 27 Jun 2024 11:13:14 +1000
David Gibson  wrote:
...
On Thu, Jun 27, 2024 at 01:45:35AM +0200, Stefano Brivio wrote:
...
Potential sum and subtraction overflows were reported by Coverity in a
few places where we use size_t and ssize_t types.
Strictly speaking, those are not false positives even if I don't see a
way to trigger those: for instance, if we receive up to n bytes from a
socket up, the return value from recv() is already constrained and
can't overflow for the usage we make in tap_handler_passt().
Actually, I think they are false positives.  In a bunch of cases the
reasoning for that does rely on assuming the kernel will never return
a value greater than the buffer size for read(), write() or similar.
No, that was exactly my point: return values are constrained by the
kernel, but a static checker doesn't necessarily have to assume a kernel
that's properly functioning.
Well, yes.
...
In general, static checkers do, especially if POSIX has a clear
definition of a system call, and for what matters to us, they should.
Right, that's the assumption I was working under.
...
But here Coverity is ignoring that, and I'm not sure we should call it
a false positive. It's kind of arbitrary really. I think Coverity in
these cases just prefers to "blindly" apply CERT C INT32-C locally,
which is not necessarily a bad choice, because "false positives" are
not so much of a nuisance.
...
So possibly just ASSERT()ing that would suffice.
In some cases yes, but as we have built-ins in gcc and Clang that aim
at keeping the cost of the checks down by, quoting gcc documentation,
using "hardware instructions to implement these built-in functions where
possible", and they already implement the operation, open-coding our own
checks for assertions looks redundant and might result in slower
code.
It's not runtime cost I'm concerned about, I'm sure that's trivial.
What does concern me is:
   - the overflow checking functions look like line noise
Sure, that bothers me as well. I'm not sure: would comments improve that? Say:
/* n += len; */
  if (sadd_overflow(n, len, &n))
      return;
or a different macro? Or an ASSERT() on sadd_overflow() itself, so that
it doesn't really look "inline"?
...that doesn't "work" either, which is rather puzzling. With this
diff, on top of this series:

diff --git a/lineread.c b/lineread.c
index f72a14e..82f262a 100644
--- a/lineread.c
+++ b/lineread.c
@@ -77,6 +77,7 @@ ssize_t lineread_get(struct lineread *lr, char **line)
 {
 	bool eof = false;
 	ssize_t line_len;
+	ssize_t test;
 
 	while ((line_len = peek_line(lr, eof)) < 0) {
 		ssize_t rc;
@@ -103,8 +104,10 @@ ssize_t lineread_get(struct lineread *lr, char
**line) 
 		if (rc == 0)
 			eof = true;
-		else if (sadd_overflow(lr->count, rc, &lr->count))
-			return -ERANGE;
+		else {
+			ASSERT(!sadd_overflow(lr->count, rc, &test));
+			lr->count += rc;
+		}
 	}
 
 	*line = lr->buf + lr->next_line;

I get these "events":

  Called function "read(lr->fd, lr->buf + lr->next_line + lr->count, 8192L - lr->next_line - lr->count)", and a possible return value may be less than zero.
  Assigning: "rc" = "read(lr->fd, lr->buf + lr->next_line + lr->count, 8192L - lr->next_line - lr->count)".

around read(), then:

  Condition "!!__builtin_add_overflow(lr->count, rc, &test)", taking false branch.

and at the sum:

  The expression "lr->count" is considered to have possibly overflowed.

More attempts below:
...
...
- it introduces extra error paths which make it harder to see what
     the code's doing
We already have equivalent error paths in all the cases I'm touching
here, though.
...
- it doesn't really save reasoning about what ranges things can
     have, because we need to know where to put them, unless we put
     them on every single operation, which makes the above points much
     worse
I wouldn't put them on every single operation. Again, it's clear that
in all these cases, we *can't* hit those overflows. Not even in the
write_remainder() case
I would just continue with the only possible reasonable approach, which
is, reasoning about which ranges things can have, and then test things
with static checkers, and if they have false positives, well, a bit of
cruft here and there (be it suppressions or redundant checks) is a
very reasonable price to pay for what they offer, I think.
...
I prefer checking that the syscall return values are within the bounds
we expect, rather than checking for later overflows, because as well
as the above, if we ever do get weird values out of the syscalls it
should show up the problem as close to the source as possible.
...
...
...
In any case, take care of those by adding two functions that
explicitly check for overflows in sums and subtractions of long signed
values, and using them.
Signed-off-by: Stefano Brivio 
---
 lineread.c |  5 +++--
 tap.c      | 21 +++++++++++++++------
 util.c     |  7 +++++--
 util.h     | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 68 insertions(+), 11 deletions(-)
diff --git a/lineread.c b/lineread.c
index 0387f4a..12f2d24 100644
--- a/lineread.c
+++ b/lineread.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
#include "lineread.h"
 #include "util.h"
@@ -102,8 +103,8 @@ ssize_t lineread_get(struct lineread *lr, char **line)
if (rc == 0)
      	eof = true;
-		else
-			lr->count += rc;
From the construction of the read, lr->count + rc can never exceed
LINEREAD_BUFFER_SIZE - lr->next_line, so this can't overflow.
Sure. But, especially as package maintainer, in this case I prefer to
have a useless check than carrying suppressions around.
Right, but my hope is that if we ASSERT() or otherwise check that
property of the read return value, then that will allow Coverity to
reason out the rest without needing an explicit suppression.
Well, I tried (of course...), along with dozens of other attempts, but
it didn't work for any of these cases. For example, here, other than an
ASSERT() on the return value of the read(), I also tried stuff like:
ASSERT(lr->count < SSIZE_MAX && rc < SSIZE_MAX)
  lr->count += (size_t)rc;
but no, it doesn't work. I think what Coverity wants to see is the sum
tested in infinite precision, first.
I also tried to check the return value of the read() call in the most
obvious way, that is, just after read():

		ASSERT(rc <= LINEREAD_BUFFER_SIZE - lr->next_line - lr->count);

but lr->count is still reported as potentially overflowed. I think the
reason is that the subtraction in the ASSERT() itself is prone to
overflow.

Anyway, given that both rc and lr->count are ssize_t, the condition of
this ASSERT() can't overflow:

		ASSERT((size_t)rc + lr->count < SSIZE_MAX);

but at this point Coverity says:

  The check "(size_t)rc + lr->count < 9223372036854775807UL", which appears to be a guard against integer overflow, is not a useful guard because it is either always true, or never true. This taints "rc".

And simpler, rather obvious stuff like:

		ASSERT(rc <= LINEREAD_BUFFER_SIZE);
		ASSERT(lr->count <= LINEREAD_BUFFER_SIZE);

doesn't work either.

I think this proves that Coverity really isn't happy unless the sum
itself happens in infinite precision (not even size_t domain with
ssize_t operands is enough).

So, well, I can report the false positive, unless you have further
ideas to check.

Meanwhile, we can either try to make this patch more acceptable, or
I'll suppress checks (downstream) as needed.
...
...
...
...
...
+		else if (saddl_overflow(lr->count, rc, &lr->count))
+			return -ERANGE;
  }
*line = lr->buf + lr->next_line;
diff --git a/tap.c b/tap.c
index ec994a2..7f8c26d 100644
--- a/tap.c
+++ b/tap.c
@@ -1031,7 +1031,11 @@ redo:
       */
      if (l2len > n) {
      	rem = recv(c->fd_tap, p + n, l2len - n, 0);
-			if ((n += rem) != l2len)
Similarly, rem <= l2len - n, and therefore n + rem <= l2len.
Same here.
...
...
+
+			if (saddl_overflow(n, rem, &n))
+				return;
+
+			if (n != l2len)
      		return;
      }
@@ -1046,7 +1050,9 @@ redo:
next:
      p += l2len;
-		n -= l2len;
We already checked that l2len <= n, so this one can't overflow either.
Same here.
...
Not sure why Coverity can't see that itself, though :/.  Possibly it
doesn't understand gotos well enough to see that the only goto next is
after that check.
It sees that, that's the path it takes in reporting a potential
overflow here. I think here, again, it's just blindly requesting
INT32-C from CERT C rules, locally.
Hmmm... in each of these cases the reasoning to show that there can't
be an overflow isn't very complicated - at least once you put in the
assumption about the syscall return values, which is why I'm hoping
asserting that fact will let Coverity sort it out.
It doesn't.
...
If it's reasoning more locally than the function, I can't see how that
won't devolve into anything other than "never use signed arithmetic in
C, at all, ever".
We do a lot of signed arithmetic, and yet, just those five cases are
problematic. I guess there's something peculiar with system calls
return values, or with ssize_t / size_t, even. Maybe I could try to
show Coverity a re-definition of those types... other than that I'm
pretty much out of ideas.
-- 
Stefano