I need the most efficient way (in cpu cycles) to determine if two numbers have the same/different sign. But the catch is if either number is zero I need to be able to distinguish it from numbers with same/different signs (ie. zero is treated as a "third" sign). The following code is similar to what I need, but the return values can be anything as long as there are only three distinct return values.
int foo(int x, int y) {
if (x * y > 0) return 1;
if (x * y < 0) return -1;
开发者_如何转开发 return 0;
}
For my specific problem, the values are in the range [-6, 6] and X is guaranteed to not be 0. I found a solution to find if two numbers have the same sign, and altered it to get the following solution.
return y? (((x^y) >= 0)? 1 : -1) : 0;
There should be some bitops/comparisons that give faster results than using multiplication, branching, comparisons.
How about:
int foo(int x,int y)
{
// As suggested by Luther Blissett below in the comments.
// Increased the size of the array to 16x16.
// This allows for simpler optimization for the compiler
// Also use +8 rather +6 in the hopes that compiler optimization will be easier
// you never know (there may be some fancy trick.
static int sign[16][16] = {
{ 1, 1, 1, 1, 1, 1, 1, 1, 0, -1, -1, -1, -1, -1, -1, -1},
{ 1, 1, 1, 1, 1, 1, 1, 1, 0, -1, -1, -1, -1, -1, -1, -1},
{ 1, 1, 1, 1, 1, 1, 1, 1, 0, -1, -1, -1, -1, -1, -1, -1},
{ 1, 1, 1, 1, 1, 1, 1, 1, 0, -1, -1, -1, -1, -1, -1, -1},
{ 1, 1, 1, 1, 1, 1, 1, 1, 0, -1, -1, -1, -1, -1, -1, -1},
{ 1, 1, 1, 1, 1, 1, 1, 1, 0, -1, -1, -1, -1, -1, -1, -1},
{ 1, 1, 1, 1, 1, 1, 1, 1, 0, -1, -1, -1, -1, -1, -1, -1},
{ 1, 1, 1, 1, 1, 1, 1, 1, 0, -1, -1, -1, -1, -1, -1, -1},
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
{ -1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 1, 1, 1, 1, 1, 1},
{ -1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 1, 1, 1, 1, 1, 1},
{ -1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 1, 1, 1, 1, 1, 1},
{ -1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 1, 1, 1, 1, 1, 1},
{ -1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 1, 1, 1, 1, 1, 1},
{ -1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 1, 1, 1, 1, 1, 1},
{ -1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 1, 1, 1, 1, 1, 1}
};
return sign[x+8][y+8];
}
This should be fast as there is no branching that will stall the processor.
Using g++ -O3 -S:
__Z3fooii:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %eax
movl 12(%ebp), %edx
popl %ebp
sall $4, %eax
addl %edx, %eax
movl _ZZ3fooiiE4sign+544(,%eax,4), %eax
ret
Here is another version (with ugly, non-portable bit manipulation tricks):
int foo(int x, int y) {
return ((x^y) >> 4) - ((x^(-y)) >> 4);
}
Some explanations:
((x^y) >> 4)
is -1 if exactly one of x and y is negative, otherwise it is 0.((x^(-y)) >> 4)
is -1 if exactly one of x and -y is negative, otherwise it is 0.- If x > 0 and y > 0, the result will be 0 - (-1) = 1.
- If x < 0 and y < 0, the result will be 0 - (-1) = 1.
- If x > 0 and y = 0, the result will be 0 - 0 = 0.
- If x < 0 and y = 0, the result will be (-1) - (-1) = 0.
- If x > 0 and y < 0, the result will be (-1) - 0 = -1.
- If x < 0 and y > 0, the result will be (-1) - 0 = -1.
Assumes two's complement arithmetic and assumes that >> shifts with sign-extension.
Your example doesn't work because you didn't put parenthesis around (x^y)
This is working:
return y? (((x^y) >= 0) ? 1 : -1) : 0;
I think you can't do much faster if you want to return -1, 1 or 0. This is because -1 is 11111111 and is quite different from 0 and 1. A set of bit operations that would return 11111111, 0 or 1 would be complicated and certainly slower than the code above.
EDIT: if instead of -1 and 1 you can cope with any negative or positive number, then you can eliminate a branch
return y ? ((x^y) | 1) : 0;
Edit:
((x*y)>>7) | -(-(x*y)>>7)
Above returns 1 if both are same sign, -1 if both are different signs.
Below returns 1 if both are positive, -1 if both are negative.
Assuming signed 32 bit values. With |x,y|<7 you could shift by 3.
((x&y)>>31) // -1 or 0
-((-x&-y)>>31) // 1 or 0
((x&y)>>31) | -((-x&-y)>>31)
Assuming < is 1 or 0.
-((x&y)<0) // -1 or 0
((-x&-y)<0) // 1 or 0
-((x&y)<0) | ((-x&-y)<0)
Either way looks like 8 operations.
You could do something like this (Only with proper variable names and done much less ugly!) Note that this ONLY works with 2s compliment numbers and if your values are limited to -6 to 6 as in your questions.
Profile it to make sure it's faster than the clear way of doing and ONLY write code like this once you have determined that you can't meet your requirements using a much more obvious approach. with branch prediction etc, branches aren't always slow on x86 for example. I would never write unportable code like this unless I had no choice to meet performance requirements.
Basically extract the sign bits and exclusive or them to get the result you want.
int foo(int x, int y)
{
int s;
if (x == 0 || y == 0) return 0;
x = x >> 4; // Bit 0 of x will be the sign bit of x
y = y >> 4; // Bit 0 of y will be the sign bit of y
s = (x ^ y) & 1; // sign is 0 if they have the same sign, 1 otherwise
return 1 - 2 * s; // Make it 1 for the same sign, -1 otherwise
}
this compiles on my compiler to a couple of quick tests for zero and what looks like quite an efficient bit of bit maniplation after that...
test ecx, ecx
je SHORT $LN1@foo
test edx, edx
je SHORT $LN1@foo
; Line 12
xor ecx, edx
mov eax, 1
sar ecx, 4
and ecx, 1
add ecx, ecx
sub eax, ecx
; Line 13
ret 0
$LN1@foo:
; Line 5
xor eax, eax
; Line 13
ret 0
To express the sign of the number x
as an "normalized" integer (i.e. -1, 0, +1) use
inline int sign(int x) { return (x > 0) - (x < 0); }
Deriving from the above, to compare x
and y
for sign equality use
inline bool same_sign(int x, int y) {
return sign(x) == sign(y);
}
for boolean result.
Or, for -1, 0, +1 result
inline int compare_sign(int x, int y) {
return sign(x) * sign(y);
}
How efficient you final code will be depends, of course, on the quality of the compiler you are using.
If the [-6..+6] constraint does not hold (nor x != 0), a branchless solution is given by (5 ops):
x*= y;
return (x >> 31) - (-x >> 31); // Sign(x * y)
and a solution that works for the full integer range is (9 ops):
return ((x >> 31) - (-x >> 31)) * ((y >> 31) - (-y >> 31)); // Sign(x) * Sign(y)
A mixture of solutions by AndreyT and Loki Astari: branchless, compact, efficient (two 1D array lookups, one multiply):
static int S[13]= { -1, -1, -1, -1, -1, -1, 0, +1, +1, +1, +1, +1, +1 }, * Sign= &S[6];
return Sign[x] * Sign[y];
Alternatively, a less compact more efficient approach (one multiply, one 1D array lookup):
static int S[73]= {
-1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1,
0,
+1, +1, +1, +1, +1, +1,
+1, +1, +1, +1, +1, +1,
+1, +1, +1, +1, +1, +1,
+1, +1, +1, +1, +1, +1,
+1, +1, +1, +1, +1, +1,
+1, +1, +1, +1, +1, +1 }, * Sign= &S[36];
return Sign[x * y];
精彩评论