C# int stored in double "==" precision problem

开发者 (Developer) https://www.devze.com 2023-01-26 23:45 Source: Internet
Here is the simplified code:

int i = 1;
double a = i;
double b = i;

Is it guaranteed that a == b is true?


Yes. 32-bit integers can be represented exactly as 64-bit floating point numbers.
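A quick way to check this claim is sketched below. The class and method names (`IntInDouble`, `RoundTrips`, `SameIntGivesEqualDoubles`) are made up for illustration and are not part of the original question.

```csharp
using System;

public static class IntInDouble
{
    // Returns true when converting an int to double and back yields the same int.
    public static bool RoundTrips(int value)
    {
        double asDouble = value;   // implicit widening conversion
        int back = (int)asDouble;  // explicit narrowing conversion
        return back == value;
    }

    // Mirrors the question: two doubles assigned from the same int are compared.
    public static bool SameIntGivesEqualDoubles(int i)
    {
        double a = i;
        double b = i;
        return a == b;
    }
}
```

Calling `IntInDouble.RoundTrips(int.MaxValue)` or `IntInDouble.RoundTrips(int.MinValue)` exercises the extremes of the 32-bit range.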


Is it guaranteed that a == b is true?

Yes. You perform the same conversion twice, and because the conversion is deterministic you end up with the same value both times, regardless of any rounding.

We can generalize your question though, to:

Can we perform arithmetic operations on 32-bit integer values stored in a double without loss of precision?

The answer to that question is also yes, as long as every result is itself an integer that fits in the mantissa.

A short justification is that operations on the mantissa bits (see http://en.wikipedia.org/wiki/Significand) are exact whenever the exact result is representable, and for results in the 32-bit integer range it is.

The longer story: as long as your integer value fits in the 52 bits of the fraction part, called the mantissa (see http://en.wikipedia.org/wiki/Double_precision), all calculations on integer values using double behave exactly.

This is because your number (say 173, which is 0000010101101b in binary) is represented as 1.010110100000b*2^7, which is exact.

All operations on the mantissa are straightforward as long as the results fit in the mantissa. Rounding of integer results occurs when the result of a particular operation does not fit in the mantissa, e.g. when you multiply a 40-bit mantissa by a 40-bit mantissa. Rounding in floating-point operations additionally occurs when the exponents differ greatly. In that case even a simple addition can lose precision because the mantissas are shifted.

Back to integers encoded in a double: even division is exact, as long as the result is an integer value. So 4.0/2.0 == 8.0/4.0 is also guaranteed to be true.
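The rules above can be checked against exact 64-bit integer arithmetic. This is only a sketch; `ExactIntegerMath` and its methods are hypothetical helpers. Note that the product of two full-range 32-bit values can need up to 62 bits, which is exactly the "does not fit in the mantissa" case.

```csharp
using System;

public static class ExactIntegerMath
{
    // True when the double sum of two ints equals the exact sum computed in long.
    public static bool SumIsExact(int x, int y)
    {
        double sum = (double)x + (double)y; // needs at most 33 bits
        return (long)sum == (long)x + (long)y;
    }

    // True when the double product of two ints equals the exact product in long.
    public static bool ProductIsExact(int x, int y)
    {
        double product = (double)x * (double)y; // may need up to 62 bits
        return (long)product == (long)x * (long)y;
    }

    // True when dividing in double gives the exact integer quotient.
    // Only meaningful when y divides x evenly.
    public static bool QuotientIsExact(int x, int y)
    {
        long lx = x, ly = y;
        if (ly == 0 || lx % ly != 0)
            throw new ArgumentException("y must be a non-zero divisor of x");
        double quotient = (double)x / (double)y;
        return quotient == (double)(lx / ly);
    }
}
```

For example, `ProductIsExact(46341, 46341)` stays within the mantissa, while `ProductIsExact(int.MaxValue, int.MaxValue)` does not, so the second call reports a rounded result.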

The problems begin when your number is not an integer. But even then, numbers are guaranteed to be represented exactly if they have the form x/2^y and x fits in 52 bits (e.g. 3/4, 5/8, 345/1024). Operations on such numbers are also exact provided y can be made equal for both operands, so even:

123456789.0/1024/1024/1024/1024 ==
(23456789.0/1024/1024/1024/1024 +
100000000.0/1024/1024/1024/1024)

is guaranteed to be true.
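The comparison above can be written as a small runnable sketch. `DyadicFractions` and `Scale` are illustrative names only. The operands are doubles, since with plain int literals C# would perform integer division instead.

```csharp
using System;

public static class DyadicFractions
{
    // Divides by 1024 four times, i.e. by 2^40. Dividing by a power of two
    // only changes the exponent, so no mantissa bits are lost here.
    public static double Scale(double x)
    {
        return x / 1024 / 1024 / 1024 / 1024;
    }

    // The comparison from the text: both sides equal 123456789 / 2^40.
    public static bool SumMatches()
    {
        return Scale(123456789) == Scale(23456789) + Scale(100000000);
    }

    // A smaller case with different denominators: 3/4 + 5/8 = 11/8.
    public static bool SmallSumMatches()
    {
        return 3.0 / 4 + 5.0 / 8 == 11.0 / 8;
    }
}
```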

An interesting fact is that you can safely perform operations on 54-bit signed integers. This is because you have an additional bit at the beginning, whose meaning is encoded by the exponent, and one additional bit for the sign. Now -2^53, which would be MIN_INT for a 54-bit signed integer, does not fit in the mantissa, but the exponent does the job here with a mantissa full of zeros.
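The boundary can be probed by converting a long to double and back. This is a sketch; `SafeIntegerRange`, `Limit` and `SurvivesDouble` are hypothetical names.

```csharp
using System;

public static class SafeIntegerRange
{
    // 2^53, the largest magnitude up to which every integer is representable.
    public const long Limit = 1L << 53;

    // True when the value is unchanged by a trip through double.
    public static bool SurvivesDouble(long value)
    {
        return (long)(double)value == value;
    }
}
```

`SurvivesDouble(-SafeIntegerRange.Limit)` holds, while `SurvivesDouble(SafeIntegerRange.Limit + 1)` does not, because 2^53 + 1 would need a 54th mantissa bit.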


Yes, you can store a (32-bit) integer number in a double (64-bit floating-point number) without precision loss.

However, as soon as you perform calculations with your double that involve values that are not exactly representable (fractions such as 0.1, or results too large for the fraction part), you will introduce rounding errors, i.e. precision loss. These errors will likely be small enough that they get rounded away when you convert your double value back to int, but the error is there, so be aware of it.
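A minimal sketch of such an error, using a hypothetical `RoundingDemo.Accumulate` helper: adding 0.1 ten times does not give exactly 1.0. Note that a plain `(int)` cast truncates rather than rounds, so `Math.Round` is the safer way back to an integer.

```csharp
using System;

public static class RoundingDemo
{
    // Adds 'step' to a running total 'count' times.
    public static double Accumulate(double step, int count)
    {
        double total = 0.0;
        for (int n = 0; n < count; n++)
        {
            total += step;
        }
        return total;
    }
}
```

With these definitions, `Accumulate(1.0, 10)` is exactly 10.0, whereas `Accumulate(0.1, 10)` falls just short of 1.0: `(int)` truncates it to 0 and `Math.Round` recovers 1.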

How it's done: See this document (IEEE Standard 754 Floating Point Numbers by Steve Hollasch) for details on how an integer can be stored as a floating-point value.

To summarise (somewhat inaccurately), a floating-point value consists of three parts: A sign bit, a "fraction" part (called the mantissa), and an "exponent" part. They're put together roughly as follows:

    value = (-1)^(sign bit) × fraction × 2^(exponent)

You can store the integer's bits in the "fraction" part of the double, which is 52 bits wide and therefore more than wide enough for a 32-bit integer. In this simplified picture the "exponent" part does no rounding work; in the actual format it only records the position of the leading bit.
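To look at the three parts of an actual double, the bits can be pulled apart with `BitConverter.DoubleToInt64Bits`. The sketch below uses a hypothetical `DoubleBits.Decompose` helper and only handles normal (non-zero, finite) values. In the real format the stored exponent is biased by 1023 and the leading 1 bit of the fraction is implicit, details the simplified formula above leaves out.

```csharp
using System;

public static class DoubleBits
{
    // Splits a double into sign bit, unbiased exponent and the 52 stored fraction bits.
    public static (int Sign, int Exponent, long Fraction) Decompose(double value)
    {
        long bits = BitConverter.DoubleToInt64Bits(value);
        int sign = (int)((bits >> 63) & 1);                // top bit
        int exponent = (int)((bits >> 52) & 0x7FF) - 1023; // 11 bits, bias removed
        long fraction = bits & 0xFFFFFFFFFFFFFL;           // low 52 bits
        return (sign, exponent, fraction);
    }
}
```

For 173.0 (1.0101101b × 2^7) this gives sign 0, exponent 7, and a fraction whose top seven bits are 0101101 followed by zeros.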


I opened Visual Studio, and I tested it.

Here is my code:

int i = 5;
double t = i;
double k = i;
MessageBox.Show((i == t).ToString()); //true
MessageBox.Show((k == t).ToString()); //true
i += 5;
t += 5;
k = i;
MessageBox.Show((i == t).ToString()); //true
MessageBox.Show((k == t).ToString()); //true
i += (int)Math.Round(5.6);
t += 5.6;
t = (int)Math.Round(t);
k = i;
MessageBox.Show((i == t).ToString()); //true
MessageBox.Show((k == t).ToString()); //true
i = int.MaxValue - 5438;
t = int.MaxValue - 5438;
k = i;
MessageBox.Show((i == t).ToString()); //true
MessageBox.Show((k == t).ToString()); //true
// double.MaxValue is far outside the int range, so this unchecked cast overflows
i = (int)Math.Round(double.MaxValue);
t = Math.Round(double.MaxValue);
k = i;
MessageBox.Show((i == t).ToString()); //false
MessageBox.Show((k == t).ToString()); //false
i = (int)Math.Round(double.MaxValue);
t = i;
k = i;
MessageBox.Show((i == t).ToString()); //true
MessageBox.Show((k == t).ToString()); //true

The result was two message boxes saying true.

I guess that concludes it: yes, you are guaranteed that it will be true.

EDIT: I extended my test a little. The only tests returning false were the ones at double.MaxValue, which does not fit in an int, but I doubt that you will use numbers that big.
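If out-of-range values are a concern, the conversion can be done in a checked context so that it fails loudly instead of silently producing an arbitrary int. This is a sketch, not part of the original test; `CheckedCast.TryToInt` is a hypothetical helper.

```csharp
using System;

public static class CheckedCast
{
    // Rounds a double and converts it to int, reporting failure
    // when the rounded value does not fit in the int range.
    public static bool TryToInt(double value, out int result)
    {
        try
        {
            result = checked((int)Math.Round(value));
            return true;
        }
        catch (OverflowException)
        {
            result = 0;
            return false;
        }
    }
}
```

`TryToInt(15.6, out int r)` succeeds with `r` equal to 16, while `TryToInt(double.MaxValue, out int r)` returns false.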
