On Thu, Jan 02, 2025 at 11:00:04PM +0100, Stefano Brivio wrote:On Thu, 2 Jan 2025 13:15:40 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:Oh.. I think I see the confusion. dhcpv6_opt() trying to get data that's not in the packet is not an issue. dhcpv6_opt() trying to get data that is (theoretically) within the packet, but *not* in the buffer indicates something very bad has happened. The former is exactly one check, every other one is the second class - trying to separate those cases is the purpose of the later "different severities" patch. The difficulty is that passing func==NULL to indicate the "try" case doesn't work if we want to still give useful errors for the serious cases: we need the function name for those too. I had been considering printing occasional trace level messages for the ok case an acceptable tradeoff for not suppressing the messages which are serious. But I see your case or that being too confusing when debugging. I did have a draft where I used an explicit boolean flag to enable/disable the non-serious errors, but gave up on it for simplicity. I'll look into a way to continue suppressing the non-serious error here. Maybe moving the (single) non-serious error case message into the caller with a wrapper.On Wed, Jan 01, 2025 at 10:54:37PM +0100, Stefano Brivio wrote:But dhcpv6_opt() trying to get data that doesn't exist is *not* an issue, including not a serious one, so if I'm debugging something with --trace and I see one of these messages I'll shout at "memory" or "packet" badness and waste time thinking it's an actual issue.On Fri, 20 Dec 2024 19:35:30 +1100 David Gibson <david(a)gibson.dropbear.id.au> wrote:I'm not following your argument here. It's exactly because (most of) the message indicate serious issues that I don't want to suppress them. I don't know what you mean by "validating identity associations".Two places in the code use the packet_get_try() variant on packet_get(). The difference is that packet_get_try() passes a NULL 'func' to packet_get_do(), which suppresses log messages. The places we use this are ones where we expect to sometimes have a failure retreiving the packet range, even in normal cases. So, suppressing the log messages seems like it makes sense, except: 1) It suppresses log messages on all errors. We (somewhat) expect to hit the case where the requested range is not within the received packet. However, it also suppresses message in cases where the requested packet index doesn't exist, the requested range has a nonsensical length or doesn't lie in even the right vague part of memory. None of those latter cases are expected, and the message would be helpful if we ever actually hit them. 2) The suppressed messages aren't actually that disruptive. For the case in ip.c, we'd log only if we run out of IPv6 packet before reaching a (non-option) L4 header. That shouldn't be the case in normal operation so getting a message (at trave level) is not unreasonable. For the case in dhcpv6.c we do suppress a message every time we look for but don't find a specific DHCPv6 option. That can happen in perfectly ok cases, but, again these are trace() level and DHCPv6 transactions aren't that common. Suppressing the message for this case doesn't seem worth the drawbacks of (1).The reason why I implemented packet_get_try() is that the messages from packet_get_do() indicate serious issues, and if I'm debugging something by looking at traces it's not great to have messages indicating that we hit a serious issue while we're simply validating identity associations.Validating identity associations (IA_NA, IA_TA, RFC 3315) is what dhcpv6_ia_notonlink() does. That's the most common case where we'll routinely call dhcpv6_opt() to fetch data which isn't there.Ok.Right, that's also my preference, but as above I compromised on this to simplify preserving the error cases that do matter.Yes, I get that, and: - I would be happy if that one were *not* reported as failureIt's not about the amount of logged messages, it's about the type of message being logged and the distracting noise possibly resulting in a substantial time waste. About 1): dhcpv6_opt() always picks pool index 0, and the base offset was already checked by the caller.Right, but dhcpv6_opt() is called in a loop, that only stops when it returns NULL. So, by construction the last call to dhcpv6_opt(), which terminates the loop, _will_ have a failing call to packet_get(). At this point - at least assuming a correctly constructed packet - the offset will point to just past the last option, which should be exactly at the end of the packet.- the calls before that one should always be enough to check if we have an actual issue with the packetYes, in this case I think that's correct.Not really sure what you mean by the data level, here.There, I used packet_get_try() because a missing option or payload doesn't indicate a bad packet at the _data_ level.In ipv6_l4hdr(), the index was already validated by a previous call to packet_get(), and the starting offset as well.Not AFAICT, the initial packet_get just validates the basic IPv6 header. The calls to packet_get_try() in the loop validate additional IP options. I don't think it will ever fail on a well-constructed packet, but it could on a bogus (or truncated) packet, where the nexthdr field indicates an option that's actually missing. This is kind of my point: it will only trip on a badly constructed packet, in which case I don't think we want to suppress messages.On the other hand, it's bad at the network level anyway, because option 59 *must* be there otherwise (I just realised), so while I'd still prefer another wording of the warning (not mentioning packet/buffer ranges... something more network-y), I would be fine with it.That sounds like another argument for moving the message for the "requested range is outside packet" case into the caller. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson