How to efficiently compare the sign of two floating-point values while handling negative zeros_问答_开发者

Given two floating-point numbers, I'm looking for an efficient way to check if they have the same sign, given that if any of the two values is zero (+0.0 or -0.0), they should be considered to have the same sign.

For instance,

SameSign(1.0, 2.0) should return true
SameSign(-1.0, -2.0) should return true
SameSign(-1.0, 2.0) should return false
SameSign(0.0, 1.0) should return true
SameSign(0.0, -1.0) should return true
SameSign(-0.0, 1.0) should return true
SameSign(-0.0, -1.0) should return true

A naive but correct implementation of SameSign in C++ would be:

bool SameSign(float a, float b)
{
    if (fabs(a) == 0.0f || fabs(b) == 0.0f)
        return true;

    return (a >= 0.0f) == (b >= 0.0f);
}

Assuming the IEEE floating-point model, here's a variant of SameSign that compiles to branchless code (at least with with Visual C++ 2008):

bool SameSign(float a, float b)
{
    int ia = binary_cast<int>(a);
    int ib = binary_cast<int>(b);

    int az = (ia & 0x7FFFFFFF) == 0;
    int bz = (ib & 0x7FFFFFFF) == 0;
    int ab = (ia ^ ib) >= 0;

    return (az | bz | ab) != 0;
}

with binary_cast defined as follow:

template <typename Target, typename Source>
inline Target binary_cast(开发者_JAVA技巧Source s)
{
    union
    {
        Source  m_source;
        Target  m_target;
    } u;
    u.m_source = s;
    return u.m_target;
}

I'm looking for two things:

A faster, more efficient implementation of SameSign, using bit tricks, FPU tricks or even SSE intrinsics.
An efficient extension of SameSign to three values.

Edit:

I've made some performance measurements on the three variants of SameSign (the two variants described in the original question, plus Stephen's one). Each function was run 200-400 times, on all consecutive pairs of values in an array of 101 floats filled at random with -1.0, -0.0, +0.0 and +1.0. Each measurement was repeated 2000 times and the minimum time was kept (to weed out all cache effects and system-induced slowdowns). The code was compiled with Visual C++ 2008 SP1 with maximum optimization and SSE2 code generation enabled. The measurements were done on a Core 2 Duo P8600 2.4 Ghz.

Here are the timings, not counting the overhead of fetching input values from the array, calling the function and retrieving the result (which amount to 6-7 clockticks):

Naive variant: 15 ticks
Bit magic variant: 13 ticks
Stephens's variant: 6 ticks

If you don't need to support infinities, you can just use:

inline bool SameSign(float a, float b) {
    return a*b >= 0.0f;
}

which is actually pretty fast on most modern hardware, and is completely portable. It doesn't work properly in the (zero, infinity) case however, because zero * infinity is NaN, and the comparison will return false, regardless of the signs. It will also incur a denormal stall on some hardware when a and b are both tiny.

perhaps something like:

inline bool same_sign(float a, float b) {
    return copysignf(a,b) == a;
}

see the man page for copysign for more info on what it does (also you may want to check that -0 != +0)

or possibly this if you have C99 functions

inline bool same_sign(float a, float b) {
    return signbitf(a) == signbitf(b);
}

as a side note, on gcc at least both copysign and signbit are builtin functions so they should be fast, if you want to make sure the builtin version is being used you can do __builtin_signbitf(a)

EDIT: this should also be easy to extend to the 3 value case as well (actually both of these should...)

inline bool same_sign(float a, float b, float c) {
    return copysignf(a,b) == a && copysignf(a,c) == a;
}

// trust the compiler to do common sub-expression elimination
inline bool same_sign(float a, float b, float c) {
    return signbitf(a) == signbitf(b) && signbitf(a) == signbitf(c);
}

// the manpages do not say that signbit returns 1 for negative... however
// if it does this should be good, (no branches for one thing...)
inline bool same_sign(float a, float b, float c) {
    int s = signbitf(a) + signbitf(b) + signbitf(c);
    return !s || s==3;
}

A small note on signbit: The macro returns an int and the man page states that "It returns a nonzero value if the value of x has its sign bit set." This means that the Spudd86's bool same_sign() is not guaranteed to work in case signbit returns two different non-zero int's for two different negative values.

Casting to bool first ensures a correct return value:

inline bool same_sign(float a, float b) {
    return (bool)signbitf(a) == (bool)signbitf(b);
}

How to efficiently compare the sign of two floating-point values while handling negative zeros

精彩评论

关注公众号

热门标签

图文推荐

How to efficiently compare the sign of two floating-point values while handling negative zeros

更多 问答 相关资讯：

精彩评论

关注公众号

热门标签

图文推荐

更多问答相关资讯：