Given an array of unsinged integers(4 octets each), what's the optimal way of finding first element with atleast one '0' bit and it's index from LSB.
e.g: where n = 9
unsinged int uIntArray[] = {
0xffffffff,
0xffffffff,
0xffffffff,
0xffffffff,
0xffffff9f,
0x00000000,
0x00000000,
0x00000000,
0x00000000,
};
Ans:
element's index = 4
bit's index = 4
I could only think of :
int main (void)
{
bool found_f = false;
int n = 9; //our test case value
unsigned int uIntArray[] = {
0xffffffff,
0xffffffff,
0xffffffff,
0xffffffff,
0xffffff8f,
0x00000000,
0x00000000,
0x00000000,
0x00000000,
};
unsigned int uIntBits [32] = {
1, 2, 4, 8, 16, 32, 64, 128,
256, 512, 1024, 2048, 4096, 8192, 16384, 32768,
65536, 131072, 262144, 524288, 1048576, 2097152, 4194304, 8388608,
16777216, 33554432, 67108864, 134217728, 268435456, 536870912, 1073741824, 2147483648
};
unsigned int idx, jdx;
int ele_idx = -1;
int bit_i开发者_Go百科dx = -1;
for (idx =0; idx < n; idx ++) {
if (uIntArray[idx] < UINT_MAX) { /* our candidate */
for (jdx =0; jdx < 32; jdx ++) {
if ((uIntBits[jdx] & uIntArray[idx])) {
ele_idx = idx;
bit_idx = jdx;
found_f = true;
break;
}
}
}
if(found_f) {
break;
}
}
fprintf (stderr, "\nEleIdx[%d] BitIdx[%d]\n", ele_idx, bit_idx);
return 0;
}
is there any better way to do it?
The index of the least significant 0
in x
is the index of the least significant 1
in ~x
. In order to find the latter you just need to count the trailing zeros in ~x
. There are quite a few ways to do it, see here http://graphics.stanford.edu/~seander/bithacks.html#ZerosOnRightLinear
Using the last method (based on DeBruijn sequence), the search would look as
static const unsigned MultiplyDeBruijnBitPosition[32] =
{
0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};
for (idx = 0; idx < n; idx ++)
if (uIntArray[idx] < UINT_MAX)
break;
if (idx < n) {
unsigned v = ~uIntArray[idx];
int bit_idx = MultiplyDeBruijnBitPosition[((v & -v) * 0x077CB531u) >> 27];
fprintf(stderr, "\nEleIdx[%d] BitIdx[%d]\n", idx, bit_idx);
}
To find the first element you can observe that if the number didn't have a 0
bit then it must be 0xff..ff
so instead of explicitly checking each bit, you can simply compare it to 0xff..ff
.
To find the least significant bit of that number I believe you will still have to check each bit.
You can make it faster by using larger datatypes. So instead of testing if each int
is 0xffffffff
, you can go with 64-bit integers and test against 0xffffffffffffffff
.
If you're willing to vectorize, you can do 128-bits (SSE) or 256-bits (AVX) at a time.
In all cases, watch your data alignment. If you don't align, it either won't work, or will make it slower.
To take it one final step, you can actually unroll the loop and test multiple words/vectors at a time. This will give you better IPC. Only when you find any zero do you go through the mess of narrowing down which bit it is.
EDIT:
To illustrate this last point, you can do this: (I've omitted the cleanup code in case idx % 4 != 0
)
for (idx =0; idx < n; idx += 4) {
unsigned int test = uIntArray[idx];
test &= uIntArray[idx + 1];
test &= uIntArray[idx + 2];
test &= uIntArray[idx + 3];
if (test < UINT_MAX){
// Find which bit it is.
}
}
Except you can do it on larger datatypes. (like SSE/AVX vectors)
This will make finding the region of the first 0 much faster, but narrowing on the exact bit will be a bit more expensive. So this approach is better if your datasize is large.
The following code seems to work nicely, using the ~x & (x + 1)
test from another (now deleted) answer and extending it. Not sure why the other answer was deleted.
/* Return the position of the first clear bit in the array,
* or -1 if none found.
* arr: array of uint32_t to search
* sz: number of elements in arr
*/
int findClearBit(uint32_t *arr, int sz)
{
int i;
for (i = 0; i < sz; i++) {
if (~arr[i]) {
switch (~arr[i] & (arr[i] + 1)) {
case 1 << 31: return (i * 32) + 31;
case 1 << 30: return (i * 32) + 30;
case 1 << 29: return (i * 32) + 29;
case 1 << 28: return (i * 32) + 28;
case 1 << 27: return (i * 32) + 27;
case 1 << 26: return (i * 32) + 26;
case 1 << 25: return (i * 32) + 25;
case 1 << 24: return (i * 32) + 24;
case 1 << 23: return (i * 32) + 23;
case 1 << 22: return (i * 32) + 22;
case 1 << 21: return (i * 32) + 21;
case 1 << 20: return (i * 32) + 20;
case 1 << 19: return (i * 32) + 19;
case 1 << 18: return (i * 32) + 18;
case 1 << 17: return (i * 32) + 17;
case 1 << 16: return (i * 32) + 16;
case 1 << 15: return (i * 32) + 15;
case 1 << 14: return (i * 32) + 14;
case 1 << 13: return (i * 32) + 13;
case 1 << 12: return (i * 32) + 12;
case 1 << 11: return (i * 32) + 11;
case 1 << 10: return (i * 32) + 10;
case 1 << 9: return (i * 32) + 9;
case 1 << 8: return (i * 32) + 8;
case 1 << 7: return (i * 32) + 7;
case 1 << 6: return (i * 32) + 6;
case 1 << 5: return (i * 32) + 5;
case 1 << 4: return (i * 32) + 4;
case 1 << 3: return (i * 32) + 3;
case 1 << 2: return (i * 32) + 2;
case 1 << 1: return (i * 32) + 1;
case 1: return (i * 32);
default: return -1;
}
}
}
return -1;
}
精彩评论