I would like to have the closest number below 1.0 as a floating-point value. By reading Wikipedia's article on IEEE-754 I have managed to find out that the binary representation of 1.0 is 0x3FF0000000000000, so the closest double value below it is actually 0x3FEFFFFFFFFFFFFF.
The only way I know of to initialize a double with this binary data is this:
double a;
*((unsigned*)(&a) + 1) = 0x3FEFFFFF;
*((unsigned*)(&a) + 0) = 0xFFFFFFFF;
Which is rather cumbersome to use.
Is there any better way to define this double number, if possible as a constant?
Hexadecimal float and double literals do exist. The syntax is 0x1.(mantissa in hex)p(power-of-two exponent, written in decimal). In your case that would be:
double x = 0x1.fffffffffffffp-1;
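To sanity-check that literal, here is a minimal sketch, assuming a C++17 (or later) compiler and IEEE-754 binary64 doubles. The helper names largest_below_one and check_literal are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: the largest double below 1.0, written as a
// hexadecimal floating-point literal (C++17 or C99 and later).
// 0x1.fffffffffffffp-1 = (1 + (2^52 - 1) / 2^52) * 2^-1 = 1 - 2^-53.
constexpr double largest_below_one() {
    return 0x1.fffffffffffffp-1;
}

// Cross-check: std::nextafter(1.0, 0.0) steps one representable
// value from 1.0 toward 0.0, so both expressions should agree.
void check_literal() {
    assert(largest_below_one() < 1.0);
    assert(largest_below_one() == std::nextafter(1.0, 0.0));
}
```

Calling check_literal() from main should pass silently on a platform that meets the assumptions above.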
It's not safe, but something like:
double a;
*(reinterpret_cast<uint64_t *>(&a)) = 0x3FEFFFFFFFFFFFFFULL;
However, this assumes that double and uint64_t have the same size and byte order on your system, and writing through the cast pointer also violates the strict-aliasing rules, so don't do this!
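Instead, just put DBL_EPSILON in <cfloat> (or, as pointed out in another answer, std::numeric_limits<double>::epsilon()) to good use. Note that epsilon is the gap between 1.0 and the next value above it; just below 1.0 the spacing is half that, so subtract epsilon / 2 (subtracting a full epsilon lands two representable values below 1.0):
#include <iostream>
#include <iomanip>
#include <limits>
using namespace std;
int main()
{
    // epsilon() / 2 is the spacing of doubles in [0.5, 1.0)
    double const x = 1.0 - numeric_limits< double >::epsilon() / 2;
    cout
        << setprecision( numeric_limits< double >::digits10 + 1 ) << fixed << x
        << endl;
}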
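If you write a bit_cast helper and use fixed-width integer types, it can be done safely. Copy the bytes with std::memcpy rather than reinterpret_cast, because reading a uint64_t object through a double reference breaks the strict-aliasing rules:
#include <cstdint>
#include <cstring>
template <typename R, typename T>
R bit_cast(const T& pValue)
{
    // R and T should be trivially copyable types of the same size
    static_assert(sizeof(R) == sizeof(T), "bit_cast requires types of equal size");
    R result;
    std::memcpy(&result, &pValue, sizeof(R)); // copy the object representation
    return result;
}
const std::uint64_t target = 0x3FEFFFFFFFFFFFFFULL;
double result = bit_cast<double>(target);
Though you can probably just subtract half of epsilon from 1.0.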
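As a rough sketch of the same idea with the standard library: C++20 provides std::bit_cast in <bit>, and the code below uses it when the compiler advertises it, falling back to std::memcpy otherwise. The helper names double_from_bits, bits_from_double and check_round_trip are hypothetical, and the checks assume IEEE-754 binary64 doubles.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#if defined(__has_include)
#  if __has_include(<bit>)
#    include <bit>  // declares std::bit_cast when compiling as C++20
#  endif
#endif

// Hypothetical helper: reinterpret a 64-bit pattern as a double.
double double_from_bits(std::uint64_t bits) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "double must be 64 bits wide");
#if defined(__cpp_lib_bit_cast)
    return std::bit_cast<double>(bits);   // C++20 standard library
#else
    double d;
    std::memcpy(&d, &bits, sizeof d);     // fallback for older standards
    return d;
#endif
}

// Hypothetical helper: the inverse direction, double to bit pattern.
std::uint64_t bits_from_double(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}

void check_round_trip() {
    const std::uint64_t target = 0x3FEFFFFFFFFFFFFFULL;
    assert(bits_from_double(1.0) == 0x3FF0000000000000ULL);
    assert(double_from_bits(target) < 1.0);
    assert(bits_from_double(double_from_bits(target)) == target);
}
```

Going through the value representation this way avoids depending on how the two halves of the double are laid out in memory.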
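It's a little archaic, but you can use a union.
Assuming a long long and a double are both 8 bytes long on your system (and keeping in mind that reading the inactive member of a union is well-defined in C but formally undefined in C++, although compilers such as GCC support it as an extension):
#include <iostream>
typedef union { long long a; double b; } my_union;
int main()
{
    my_union c;
    c.b = 1.0;
    c.a--; // decrementing the bit pattern steps to the next double toward zero
    std::cout << "Double value is " << c.b << std::endl;
    std::cout << "Long long value is " << c.a << std::endl;
}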
Here you don't need to know ahead of time what the bit representation of 1.0 is.
The 0x1.fffffffffffffp-1 syntax is great, but it is only available in C99 or C++17 and later.
But there is a workaround: no (pointer) casting, no UB/IB, just simple math.
double x = (double)0x1fffffffffffff / (1LL << 53);
This works because 0x1fffffffffffff is 2^53 - 1, which fits exactly in a double's 53-bit significand, and dividing by a power of two is exact.
If I need pi, and pi as a double is 0x1.921fb54442d18p1 in hex, just write
const double PI = (double)0x1921fb54442d18 / (1LL << 51);
If your constant has a large or small exponent, you could use the function exp2 instead of the shift, but exp2 is C99/C++11. Use pow to the rescue!
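For exponents that do not fit in a 64-bit shift, here is a minimal sketch that swaps in std::ldexp from <cmath> in place of exp2 or pow (my substitution, not what the answer above uses). std::ldexp has been available since C89/C++98 and scales by a power of two directly. The helper names scale_by_power_of_two and check_scaling are hypothetical, and IEEE-754 binary64 doubles are assumed.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: computes significand * 2^exponent.
// The integer-to-double conversion is exact for significand < 2^53,
// and std::ldexp only adjusts the exponent, so no rounding occurs
// unless the result overflows or becomes subnormal.
double scale_by_power_of_two(std::uint64_t significand, int exponent) {
    return std::ldexp(static_cast<double>(significand), exponent);
}

void check_scaling() {
    // 0x1.fffffffffffffp-1, the largest double below 1.0
    assert(scale_by_power_of_two(0x1fffffffffffffULL, -53)
           == std::nextafter(1.0, 0.0));
    // 0x1.921fb54442d18p1, the double closest to pi
    assert(scale_by_power_of_two(0x1921fb54442d18ULL, -51)
           == 3.141592653589793);
    // 0x1p-1022, the smallest normal double; far beyond what 1LL << n can express
    assert(scale_by_power_of_two(1, -1022) == 2.2250738585072014e-308);
}
```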
Rather than all the bit juggling, the most direct solution is to use nextafter() from math.h. Thus:
#include <math.h>
double a = nextafter(1.0, 0.0);
Read this as: the next floating-point value after 1.0 in the direction of 0.0; an almost direct encoding of "the closest number below 1.0" from the original question.
https://godbolt.org/z/MTY4v4exz
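A small sketch of the same call through C++'s <cmath>, stepping in both directions to show that the gap below 1.0 is half the gap above it. The names just_below, just_above and check_neighbours are hypothetical, and IEEE-754 binary64 doubles are assumed.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: the nearest representable double below x.
double just_below(double x) {
    return std::nextafter(x, -std::numeric_limits<double>::infinity());
}

// Hypothetical helper: the nearest representable double above x.
double just_above(double x) {
    return std::nextafter(x, std::numeric_limits<double>::infinity());
}

void check_neighbours() {
    const double eps = std::numeric_limits<double>::epsilon();
    assert(just_below(1.0) < 1.0 && 1.0 < just_above(1.0));
    assert(just_above(1.0) - 1.0 == eps);        // gap above 1.0 is epsilon
    assert(1.0 - just_below(1.0) == eps / 2);    // gap below 1.0 is half of that
    assert(just_above(just_below(1.0)) == 1.0);  // stepping back returns to 1.0
}
```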