On Wed, Sep 18, 2024 at 08:20:25AM +0200, Markus Armbruster wrote:David Gibson <david(a)gibson.dropbear.id.au> writes:Basically, yes. They're just slurped up by a script to generate our seccomp() filter.write_remainder() steps through the buffers in an IO vector writing out everything past a certain byte offset. However, on each iteration it rescans the buffer from the beginning to find out where we're up to. With an unfortunate set of write sizes this could lead to quadratic behaviour. In an even less likely set of circumstances (total vector length > maximum size_t) the 'skip' variable could overflow. This is one factor in a longstanding Coverity error we've seen (although I still can't figure out the remainder of its complaint). Rework write_remainder() to always work out our new position in the vector relative to our old/current position, rather than starting from the beginning each time. As a bonus this seems to fix the Coverity error. Signed-off-by: David Gibson <david(a)gibson.dropbear.id.au> --- util.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/util.c b/util.c index 7db7c2e7..43e5f5ec 100644 --- a/util.c +++ b/util.c @@ -615,28 +615,30 @@ int write_all_buf(int fd, const void *buf, size_t len) * * Return: 0 on success, -1 on error (with errno set) * - * #syscalls write writev + * #syscalls writevI'm not familiar with this annotation. I guess it's system calls used in directly this function, not including the ones used in callees.Yes.*/ int write_remainder(int fd, const struct iovec *iov, size_t iovcnt, size_t skip) { - size_t offset, i; + size_t i = 0, offset; - while ((i = iov_skip_bytes(iov, iovcnt, skip, &offset)) < iovcnt) { + while ((i += iov_skip_bytes(iov + i, iovcnt - i, skip, &offset)) < iovcnt) { ssize_t rc; if (offset) { - rc = write(fd, (char *)iov[i].iov_base + offset, - iov[i].iov_len - offset); - } else { - rc = writev(fd, &iov[i], iovcnt - i); + /* Write the remainder of the partially written buffer */ + if (write_all_buf(fd, (char *)iov[i].iov_base + offset, + iov[i].iov_len - offset) < 0) + return -1; + i++; } + /* Write as much of the remaining whole buffers as we can */ + rc = writev(fd, &iov[i], iovcnt - i);Last argument can be zero now. Okay.Actually, they're not rare - we nearly always pass a non-zero initial skip value. But write_remainder() itself is somewhat rare, and making a copy of the iov is kind of structurally awkward in passt (we don't allocate). -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibsonif (rc < 0) return -1; - skip += rc; + skip = rc; } - return 0; }We could use try to save system calls by only using writev(), with a suitably adjusted copy of @iov. But non-zero @offset should be rare, so I guess it's not worth the bother.