Does IEEE int->float conversion commute with + and *?

Developer https://www.devze.com 2023-02-06 18:07 Source: web

Suppose that TOFLOAT represents the operation of converting/coercing a value of some integer type INT to one of some (range-compatible[1]) floating-point type FLOAT, according to the IEEE standards. Does this operation commute with addition and multiplication? In other words, if x and y are arbitrary values of type INT, does the IEEE standard guarantee that the following equalities will always evaluate to true?

  TOFLOAT(x) + TOFLOAT(y) == TOFLOAT(x+y)
  TOFLOAT(x) * TOFLOAT(y) == TOFLOAT(x*y)

Thanks!

~kj

[1] By "range-compatible" I mean that every value of type INT fits within the range of values representable as type FLOAT; this qualification is probably unnecessary for IEEE types.

No, neither of these holds. For a simple counterexample to the first one, take a 64-bit (signed or unsigned) integer type and the usual IEEE binary64 double-precision type, and consider x = 2**53 + 1 and y = 2. Under the IEEE 754 rules, assuming the usual default rounding mode of round-half-to-even, x is not exactly representable and TOFLOAT(x) rounds to 2**53, so TOFLOAT(x) + TOFLOAT(y) will be 2**53 + 2, while TOFLOAT(x + y) rounds 2**53 + 3 to 2**53 + 4. Counterexamples for the multiplication case should be equally easy to find.

EDIT: For the multiplication, a counterexample is given by x = 2**53 + 1 and y = 3: TOFLOAT(x) * TOFLOAT(y) is 3 * 2**53, while TOFLOAT(x * y) rounds 3 * 2**53 + 3 to 3 * 2**53 + 4.
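
For anyone who wants to check the two counterexamples above, here is a minimal Python sketch. It rests on some assumptions that are not part of the question: Python's float is taken to be IEEE binary64 (true on common platforms), int-to-float conversion is taken to round to nearest with ties to even, and the helper names tofloat and fits_int64 are made up for illustration. Python ints are unbounded, so the 64-bit INT is only modelled by a range check.

```python
# Sketch only: models TOFLOAT with Python's int -> float conversion.
# Assumption: float is IEEE 754 binary64, conversion rounds half to even.

def tofloat(n: int) -> float:
    """Hypothetical stand-in for the question's TOFLOAT."""
    return float(n)

def fits_int64(n: int) -> bool:
    """True if n is representable as a signed 64-bit integer."""
    return -2**63 <= n < 2**63

x = 2**53 + 1  # smallest positive integer that binary64 cannot represent

# Addition counterexample: y = 2
y = 2
assert fits_int64(x) and fits_int64(y) and fits_int64(x + y)
add_lhs = tofloat(x) + tofloat(y)   # 2**53 + 2
add_rhs = tofloat(x + y)            # 2**53 + 3 rounds to 2**53 + 4
print(add_lhs == add_rhs)           # prints False

# Multiplication counterexample: y = 3
y = 3
assert fits_int64(x * y)
mul_lhs = tofloat(x) * tofloat(y)   # 3 * 2**53
mul_rhs = tofloat(x * y)            # 3 * 2**53 + 3 rounds to 3 * 2**53 + 4
print(mul_lhs == mul_rhs)           # prints False
```

Comparing a float with an int in Python is exact, so the values in the comments can be verified directly with expressions such as add_lhs == 2**53 + 2.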

I think both hold, provided the operations do not overflow and you are converting, e.g., 32-bit integers into IEEE double precision. I could be wrong, however, since I don't have the IEEE standard at my disposal.

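Indeed, 32-bit integers are exactly representable as IEEE 754 double-precision floating-point numbers, and the basic arithmetic operations are correctly rounded, so they are exact whenever the true result is itself representable. Since a double has a 53-bit significand, (strictly) more than 32 bits of precision, any sum or product that still fits in 32 bits is representable, so this should hold.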
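
The 32-bit claim can be spot-checked, though not proved, with a short Python sketch. The assumptions are the same as before (Python float as IEEE binary64), plus the answer's own condition that the integer operation does not overflow a signed 32-bit range; the names fits_int32 and commutes are hypothetical helpers. It tests a grid of edge values and a fixed-seed random sample rather than all 2**64 pairs.

```python
import random

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def fits_int32(n: int) -> bool:
    """True if n is representable as a signed 32-bit integer."""
    return INT32_MIN <= n <= INT32_MAX

def commutes(x: int, y: int) -> bool:
    """Check both identities for x, y, skipping any operation whose
    integer result would overflow int32 (the answer's precondition)."""
    ok = True
    if fits_int32(x + y):
        ok = ok and float(x) + float(y) == float(x + y)
    if fits_int32(x * y):
        ok = ok and float(x) * float(y) == float(x * y)
    return ok

# Edge values, including operands near sqrt(2**31) for the product case.
edge = [INT32_MIN, INT32_MIN + 1, -46341, -1, 0, 1, 46340, 46341,
        2**16, INT32_MAX - 1, INT32_MAX]
samples = [(a, b) for a in edge for b in edge]

# Fixed seed so the run is reproducible; small second operands keep
# many of the products inside the int32 range.
rng = random.Random(0)
samples += [(rng.randint(INT32_MIN, INT32_MAX), rng.randint(-2**16, 2**16))
            for _ in range(100_000)]

failures = [(a, b) for a, b in samples if not commutes(a, b)]
print(len(failures))  # prints 0
```

Note that the overflow check matters for multiplication: the mathematical product of two 32-bit integers can need more than 53 bits, so the identity is only expected to hold when the product itself still fits in the integer type.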