Calculating with a variable outside of its bounds in C

If I make a calculation with a variable where an intermediate part of the calculation goes higher than the bounds of that variable type, is there any hazard that some platforms may not like?

This is an example of what I'm asking:

int a, b;
a=30000;
b=(a*32000)/32767;

I have compiled this, and it does give the correct answer of 29297 (well, within truncating error, anyway). But the part that worries me is that 30,000*32,000 = 960,000,000, which is a 30-bit number, and thus cannot be stored in a 16-bit int. The end result is well within the bounds of an int, but I was expecting that whatever working part of memory would have the same size allocated as the largest source variables did, so an overflow error would occur.

This is just a small example to show my problem. I am trying to avoid floating point by expressing the fraction as a fraction of the maximum value that the variable can store (in this case a signed integer, so 32767 on the positive side), because I believe the embedded system I'm using does not have an FPU.

So how do most processors handle calculations out of the bounds of the source and destination variables?

On a 16-bit compiler/CPU, you can (almost) plan on that code giving incorrect results. This is a bit sad, since nearly every CPU (that has a multiply instruction at all) will produce and store the intermediate result, but no C compiler (of which I'm aware) will normally use it (and if you made a and b unsigned, it wouldn't be allowed to use it).

You have a few choices to deal with this. One is to write a small muldiv function in assembly language that does the multiplication (preserving the high word), then does the division on that, and finally returns the value to C once it has been reduced back into range.
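If the compiler offers a 32-bit integer type, the same idea can be sketched in portable C instead of assembly. This is only an illustration: the name muldiv16 and the choice to saturate when the quotient does not fit are assumptions of mine, not something the question or this answer specifies, and whether the compiler maps the widening multiply onto a single instruction is up to the compiler.

```c
#include <stdint.h>

/* Hypothetical helper: computes (a * b) / c with a 32-bit intermediate.
 * Assumes c != 0. Saturates if the quotient does not fit in 16 bits. */
int16_t muldiv16(int16_t a, int16_t b, int16_t c)
{
    /* The product of two 16-bit values always fits in 32 bits. */
    int32_t product = (int32_t)a * (int32_t)b;
    int32_t quotient = product / c;

    if (quotient > INT16_MAX) return INT16_MAX;
    if (quotient < INT16_MIN) return INT16_MIN;
    return (int16_t)quotient;
}
```

With the values from the question, muldiv16(30000, 32000, 32767) gives 29297.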

Another option is to do the math on unsigned integers, which at least allow you to figure out when a problem occurred. Unfortunately, none of the choices is what I'd call particularly appealing though...
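As a rough sketch of what "figuring out when a problem occurred" could look like with unsigned 16-bit values: since unsigned arithmetic wraps in a defined way, you can divide the wrapped product back and see whether you recover the other operand. The function name mul_u16_wrapped is made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: stores the 16-bit (wrapped) product of a and b in
 * *product and returns true if the true product did not fit in 16 bits. */
bool mul_u16_wrapped(uint16_t a, uint16_t b, uint16_t *product)
{
    /* Go through uint32_t so the multiplication stays unsigned even where
     * int is 32 bits and the operands would otherwise be promoted to int. */
    *product = (uint16_t)((uint32_t)a * b);

    /* If nothing wrapped, dividing the product by a gives b back. */
    return a != 0 && *product / a != b;
}
```

For 30000 * 32000 this reports a wrap and leaves 28672 in *product.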

As far as I know, most if not all processors will hold the result of a word * word multiplication in a double word -- meaning the result of an 8-bit * 8-bit multiplication is stored in 16-bit register(s) on an 8-bit processor, and the result of a 32-bit * 32-bit multiplication is stored in 64-bit register(s) on a 32-bit machine. (At least, that's how it's been on all the embedded microcontrollers I've used.)

If that weren't the case, the processor would be severely crippled in the sense of only allowing half-word * half-word multiplication.

AFAIK this kind of thing is formally "undefined". You have to do the algebra necessary to prevent overflow. That's always your first choice. Numeric stability is no accident, it requires some care in deciding when and how to do division and multiplication.

Or, you have to guarantee that you'll use an intermediate result buffer that's big enough.

Using a large intermediate buffer is what some C compilers do anyway. The language, however, doesn't make any guarantees.

So, to be sure that it works, most folks do something like this.

short a= 30000;
int temp= a;
int temp2= (temp*32000)/32767;
// here you can check for errors; if temp2 > 32767, you have overflow.
short b= temp2;

Signed integer overflow is undefined behavior.

Almost any implementation you could possibly meet will wrap around on integer overflow, because (a) everyone uses 2's complement, in which arithmetic operations are bitwise identical for signed and unsigned types of the same size, and (b) wraparound is the defined behavior of unsigned types in C.

So, on an implementation with a 16 bit int, I would expect the result 0 for your calculation (and that is the result that it must have if you'd used an unsigned 16 bit int). But I'd code against the possibility it might throw a hardware exception, explode, etc.
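To see where the 0 comes from: 960,000,000 modulo 65536 is 28672, and 28672 / 32767 truncates to 0. A small sketch that imitates the unsigned 16-bit calculation on any platform (the function name is mine, and the explicit casts stand in for what a 16-bit unsigned int would do by itself):

```c
#include <stdint.h>

/* Hypothetical helper: evaluates (a * 32000) / 32767 the way unsigned
 * 16-bit arithmetic would, keeping only the low 16 bits of the product. */
uint16_t scale_with_16bit_wrap(uint16_t a)
{
    uint16_t product = (uint16_t)((uint32_t)a * 32000u); /* low 16 bits only */
    return (uint16_t)(product / 32767u);
}
```

Calling scale_with_16bit_wrap(30000) returns 0 rather than 29297.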

Note that if you do the calculation with two 16 bit short variables on a machine with a 32 bit int, then you will generally get the "right" answer 29297, because the intermediate value (a*32000) is an int, and only gets truncated back to short at the end. I say "generally" because converting an out-of-bounds integer value to a signed integer type either gives an implementation-defined result or else raises an implementation-defined signal. But again, any implementation you'll encounter in polite company just takes a modulus.

Are you sure your compiler has 16 bit integers? On most systems nowadays, ints are 32 bits. Another possible reason you aren't getting an error is that some compilers will recognize that they can compute something like this at compile time and will do so.

If you are really concerned that you will end up with overflow, you can sometimes reorder or factor the formula differently so that no intermediate terms will overflow. In your example that would be hard to do since all of your terms are near the limit of a 16 bit value. Do you need the number to be exactly right, or can you approximate? If you can, you can do something like this:

int a, b;
a=30000;
//b=(a*32000)/32767 ~= a * (32000/32768) = a *(125/128)
b = (a / 128) * 125; // if a=30000, b = 29250 - about 0.16% error
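If losing the remainder of a / 128 costs too much precision, one possible refinement is to scale the quotient and the remainder separately. The sketch below assumes a is non-negative; the helper name is made up. Every intermediate stays below 32768 (at most 255 * 125 = 31875 and 127 * 125 = 15875), so it also works with a 16-bit int.

```c
#include <stdint.h>

/* Hypothetical helper: computes a * 125 / 128 for 0 <= a <= 32767
 * without any intermediate value that needs more than 16 bits. */
int16_t scale_125_128(int16_t a)
{
    int16_t whole = a / 128;  /* at most 255 */
    int16_t rest  = a % 128;  /* at most 127 */
    return (int16_t)(whole * 125 + (rest * 125) / 128);
}
```

For a=30000 this gives 29296, which matches a * 125 / 128 evaluated in a wider type.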

Another option would be to use larger sized types for intermediate terms. If your compiler had 16 bit ints and 32 bit longs, you could do something like this:

int a, b;
a=30000;
b=((long)a*32000L)/32767L;

Really, there's no set answer for how to handle overflow. You need to evaluate each case on its own and decide what the best solution is.

Both your compiler and your target processor determine the sizes of the various data types. Compilers will usually promote variables to the largest easy-to-work-with size during calculations and then convert the result to whatever size is needed for the assignment at the end. There are also C rules that require promotion to sizes which are more difficult to work with for some calculations. If you are compiling for an AVR, which has 8 bit registers but defines an int to be 16 bits, many calculations end up using more registers than you might think they need, because of this promotion and because constant numbers in your code have to be treated as int or unsigned int unless the compiler can prove to itself that this won't affect the outcome of the calculations.

Try rewriting your code with various sizes of integers (short, int, long, long long) and see how that goes. You may also want to write a simple program that prints out the sizeof( ) of the standard predefined types. If you need to worry about the sizes of your integer variables and/or the intermediate results of your calculations, then you should include <stdint.h> and use things like uint32_t and int64_t for your declarations and type casting.
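A minimal sketch of such a size-printing routine, assuming a C99 compiler with <stdint.h> and a working fprintf (on a bare-metal target you would have to send the output somewhere else). The function name is made up for the example.

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: prints the size in bytes of the standard integer
 * types, followed by two fixed-width types for comparison. */
void print_integer_sizes(FILE *out)
{
    fprintf(out, "bits per byte: %u\n", (unsigned)CHAR_BIT);
    fprintf(out, "short:         %u\n", (unsigned)sizeof(short));
    fprintf(out, "int:           %u\n", (unsigned)sizeof(int));
    fprintf(out, "long:          %u\n", (unsigned)sizeof(long));
    fprintf(out, "long long:     %u\n", (unsigned)sizeof(long long));
    fprintf(out, "int16_t:       %u\n", (unsigned)sizeof(int16_t));
    fprintf(out, "int32_t:       %u\n", (unsigned)sizeof(int32_t));
}
```

Call it as print_integer_sizes(stdout). The first five lines vary between compilers and targets, which is exactly why the fixed-width types are the safer choice when the width matters.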