Every now and then, especially when doing 64bit builds of some code base, I notice that there are plenty of cases where integer overflows are possible. The most common case is that I do something like this:
// Creates a QPixmap out of some block of data; this function comes from library A
QPixmap createFromData( const char *data, unsigned int len );
const std::vector<char> buf = createScreenShot();
return createFromData( &buf[0], buf.size() ); // <-- warning here in 64bit builds
The thing is that std::vector::size() nicely returns a size_t (which is 8 bytes in 64bit builds) but the function happens to take an unsigned int (which is still only 4 bytes in 64bit builds). So the compiler warns, correctly.
If possible, I try to fix up the signatures to use the correct types in the first place. However, I'm often hitting this problem when combining functions from different libraries which I cannot modify. Unfortunately, I often resort to some reasoning along the lines of "Okay, nobody will ever do a screenshot generating more than 4GB of data, so why bother" and just change the code to do
return createFromData( &buf[0], static_cast<unsigned int>( buf.size() ) );
so that the compiler shuts up. However, this feels really evil. So I've been considering adding some sort of runtime assertion which at least yields a nice error in debug builds, as in:
assert( buf.size() <= std::numeric_limits<unsigned int>::max() );
This is a bit nicer already, but I wonder: how do you deal with this sort of problem, that is: integer overflows which are "almost" impossible (in practice)? I guess that means they never occur for you and never occur for QA - but then they explode in the face of the customer.
If you can't fix the types (because you can't break library compatibility), and you're "confident" that the size will never get that big, you can use boost::numeric_cast in place of the static_cast. This will throw an exception if the value is too big.
Of course the surrounding code then has to do something vaguely sensible with the exception - since it's a "not expected ever to happen" condition, that might just mean shutting down cleanly. Still better than continuing with the wrong size.
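To make that concrete, a minimal sketch of the call from the question using boost::numeric_cast (assuming Boost is available):

#include <boost/numeric/conversion/cast.hpp>

// Throws boost::numeric::bad_numeric_cast (a positive_overflow, in this
// case) instead of silently truncating when buf.size() does not fit.
return createFromData( &buf[0],
                       boost::numeric_cast<unsigned int>( buf.size() ) );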
The solution depends on context. In some cases, you know where the data comes from, and can exclude overflow: an int that is initialized with 0 and incremented once a second, for example, isn't going to overflow anytime in the lifetime of the machine (2^31 seconds is roughly 68 years). In such cases, you just convert (or allow the implicit conversion to do its stuff), and don't worry about it.
Another type of case is fairly similar: in your case, for example, it's probably not reasonable for a screen shot to have more data than can be represented by an int, so the conversion is also safe. Provided the data really did come from a screen shot; in such cases, the usual procedure is to validate the data on input, ensuring that it fulfills your constraints downstream, and then do no further validation.
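A minimal sketch of that validate-on-input pattern, assuming a hypothetical captureRawPixels() routine as the actual source of the data:

#include <limits>
#include <stdexcept>
#include <vector>

std::vector<char> captureRawPixels(); // hypothetical platform capture routine

std::vector<char> createScreenShot()
{
    std::vector<char> buf = captureRawPixels();
    // Validate once at the input boundary; downstream conversions to
    // unsigned int then need no further checks.
    if ( buf.size() > std::numeric_limits<unsigned int>::max() )
        throw std::runtime_error( "screenshot larger than 4GB" );
    return buf;
}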
Finally, if you have no real control over where the data is coming from, and can't validate on entry (at least not for your constraints downstream), you're stuck with using some sort of checking conversion, validating immediately at the point of conversion.
If you push an overflowing 64-bit value into a 32-bit library, you open Pandora's box: the library sees a silently truncated size, and what it does with it from there is effectively undefined behaviour.
Throw an exception. Since exceptions can in general spring up almost anywhere, you should have suitable code to catch them anyway. Given that, you may as well exploit it.
Error messages are unpleasant but they're better than undefined behaviour.
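As a sketch, the catch could sit at the outermost level, so the "not expected ever to happen" overflow turns into a clean shutdown (runApplication() here is a hypothetical stand-in for the program's real entry point):

#include <exception>
#include <iostream>

int main() {
    try {
        runApplication(); // hypothetical application entry point
    } catch ( const std::exception &e ) {
        // Unpleasant, but far better than undefined behaviour.
        std::cerr << "fatal error: " << e.what() << std::endl;
        return 1;
    }
}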
Such scenarios can be handled in one of four ways, or using a combination of them:
- use the right types
- use static assertions
- use runtime assertions
- ignore it until it hurts
Usually the best approach is to use the right types until your code gets ugly, and then roll in static assertions. Static assertions are much better than runtime assertions for this very purpose.
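For instance, a sketch assuming C++11's static_assert (in older code BOOST_STATIC_ASSERT would play the same role) - which also shows why a static assertion can't save you in the question's scenario, since on a 64bit build it simply fails to compile:

#include <cstddef>
#include <limits>

// Encodes the assumption that every possible size_t value fits into
// unsigned int. On a typical 64bit build this fails to compile.
static_assert( std::numeric_limits<std::size_t>::max()
                   <= std::numeric_limits<unsigned int>::max(),
               "size_t does not fit into unsigned int on this platform" );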
Finally, when static assertions won't work (like in your example), you use runtime assertions - yes, they get into customers' faces, but at least your program behaves predictably. Yes, customers don't like assertions - they panic ("we have an error!" in all caps) - but without the assertion the program would likely misbehave, and there would be no way to easily diagnose the problem.
One thing just came to my mind: since I need some sort of runtime check (whether or not the value of e.g. buf.size() exceeds the range of unsigned int can only be tested at runtime), but I do not want to have a million assert() invocations everywhere, I could do something like
#include <cassert>
#include <limits>

template <typename T, typename U>
T integer_cast( U v ) {
    // Assert that the value fits into the destination type before narrowing.
    assert( v <= std::numeric_limits<T>::max() );
    return static_cast<T>( v );
}
That way, I would at least have the assertion centralized, and
return createFromData( &buf[0], integer_cast<unsigned int>( buf.size() ) );
is a tiny bit better. Maybe I should rather throw an exception (it is quite exceptional indeed!) instead of assert'ing, to give the caller a chance to handle the situation gracefully by rolling back previous work and issuing diagnostic output or the like.
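A throwing variant might look like this (a sketch; the name and the choice of std::range_error are arbitrary):

#include <limits>
#include <stdexcept>

template <typename T, typename U>
T checked_integer_cast( U v ) {
    // Throw instead of asserting, so the caller can recover gracefully.
    if ( v > std::numeric_limits<T>::max() )
        throw std::range_error( "checked_integer_cast: value out of range" );
    return static_cast<T>( v );
}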