How to partition bits in a bit array with less than linear time

This is an interview question I faced recently.

Given an array of 1 and 0, find a way to partition the bits in place so that 0's are grouped together, and 1's are grouped together. It does not matter whether 1's are ahead of 0's or 0's are ahead of 1's.

An example input is 101010101, and output is either 111110000 or 000011111.

Solve the problem in less than linear time.

To simplify the problem: the input is an integer array in which each element is either 1 or 0. The output is the same integer array with the integers partitioned.

To me, this is an easy question if it can be solved in O(N). My approach is to use two pointers, starting from both ends of the array, and move them toward each other; whenever both pointers stop on an element that is on the wrong side, swap the two.

    int * start = array;
    int * end = array + length - 1;

    while (start < end) {
        // Assume 0 always at the end
        if (*end == 0) {
            --end; 
            continue;
        }

        // Assume 1 always at the beginning
        if (*start == 1) {
            ++start; 
            continue;
        }

        swap(*start, *end);
    }

However, the interviewer insisted there is a sub-linear solution. I thought hard about it but still could not find an answer.

Can anyone help on this interview question?

UPDATE: Seeing replies on SO stating that the problem cannot be solved in sub-linear time, I can confirm my original belief that there is no sub-linear solution.

Is it possible the interviewer was playing a trick?

I don't see how there can be a solution faster than linear time.

Imagine a bit array that is all 1's. Any solution will require examining every bit in this array before declaring that it is already partitioned. Examining every bit takes linear time.

It's not possible. Doing it in less than linear time implies that you don't look at every array element (like a binary search). However since there is no way to know what any element of the array is without looking at it, you must look at each array element at least once.

You can use lookup tables to make it faster, but O(n/8) is still O(n), so either the interviewer was wrong or you misunderstood the question.
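To make the lookup-table idea concrete, here is a minimal sketch. It assumes the bits are packed eight per byte (most significant bit first) rather than stored one per `int` as in the question; the names `make_popcount_table` and `partition_packed` are made up for illustration. It still touches every byte, so it is O(n/8), which is O(n).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a 256-entry table: table[b] = number of set bits in byte value b.
std::array<std::uint8_t, 256> make_popcount_table() {
    std::array<std::uint8_t, 256> table{};  // table[0] starts at 0
    for (int b = 1; b < 256; ++b) {
        table[b] = static_cast<std::uint8_t>((b & 1) + table[b / 2]);
    }
    return table;
}

// Partition a packed bit array in place: all 1 bits first, then all 0 bits.
// Assumption: the array length is a whole number of bytes.
void partition_packed(std::vector<std::uint8_t>& bytes) {
    static const std::array<std::uint8_t, 256> table = make_popcount_table();

    // Pass 1: count the set bits, one table lookup per byte.
    std::size_t ones = 0;
    for (std::uint8_t byte : bytes) {
        ones += table[byte];
    }

    // Pass 2: rewrite the array as `ones` 1 bits followed by 0 bits.
    for (std::uint8_t& byte : bytes) {
        if (ones >= 8) {
            byte = 0xFF;
            ones -= 8;
        } else {
            byte = static_cast<std::uint8_t>(0xFF << (8 - ones));
            ones = 0;
        }
    }
}
```

The table only reduces the constant factor: each byte is still read once and written once.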

It is possible to do it faster than linear time: given enough memory, it can be done in O(1).

Use the bitmask as index in a vector which maps to the partitioned bitmask.

Using your example, at index 341 (101010101) the value 496 (111110000) is stored.
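A rough sketch of such a table, assuming the whole bit array fits in a small fixed width (9 bits in the example). The function name `build_partition_table` is hypothetical. Note that building the table is exponential in the width, and the O(1) lookup only holds when the width is treated as a constant.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Map every `width`-bit mask to its partitioned form (all 1s first, then 0s).
// Assumption: `width` is small (well under 32), since the table has 2^width entries.
std::vector<std::uint32_t> build_partition_table(unsigned width) {
    std::vector<std::uint32_t> table(std::size_t{1} << width);
    for (std::uint32_t mask = 0; mask < table.size(); ++mask) {
        // Count set bits by repeatedly clearing the lowest one.
        unsigned ones = 0;
        for (std::uint32_t m = mask; m != 0; m &= m - 1) {
            ++ones;
        }
        // `ones` 1 bits, shifted up to the top of the width-bit field.
        table[mask] = ((std::uint32_t{1} << ones) - 1) << (width - ones);
    }
    return table;
}
```

With `width` set to 9, the entry at index 341 is 496, matching the example above.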

Perhaps the confusion comes from "less than linear time". For example, this solution counts the number of set bits, then makes a mask containing that many bits. It only iterates while there are uncounted set bits:

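    // from http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan
    unsigned count_bits(unsigned pX)
    {
        unsigned result;
        // Each iteration clears the lowest set bit, so the loop runs once per 1 bit.
        for (result = 0; pX; ++result)
        {
            pX &= pX - 1;
        }

        return result;
    }

    unsigned n = /* the number */;

    // r contains 000...111, with number of 1's equal to number of 1's in n
    // (assumes n is not all 1's: shifting by the full width of unsigned is undefined)
    unsigned r = (1u << count_bits(n)) - 1;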

Even though this minimizes the number of bits to count, it's still linear. So if this is what is meant by "sub-linear", there you go.

But if they really meant sub-linear as in logarithmic or constant, I don't see a way. You could conceivably make a look-up table for every value, but :/

Technically you could send each element of the array to a separate processor and then do it in less than linear time. If you have N processors, you could even do it in O(1) time!

As others said, I don't believe this can be done in less than linear time. For a linear-time solution, you can use STL algorithms instead of your own loop, like this:

    #include <algorithm>
    int a1[8] = {1,0,1,0,1,0,1,0};
    // std::remove compacts the 1s to the front; std::fill zeroes the rest.
    std::fill(std::remove(a1, a1+8, 0), a1+8, 0);
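As an alternative to the remove-and-fill combination, `std::partition` expresses the same intent directly. This is a small sketch, not the answerer's code; the wrapper name `partition_ones_first` is made up for illustration. It is still linear, since the predicate runs once per element.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Moves every 1 ahead of every 0 in place and returns how many 1s there are.
std::size_t partition_ones_first(std::vector<int>& bits) {
    auto first_zero = std::partition(bits.begin(), bits.end(),
                                     [](int b) { return b == 1; });
    return static_cast<std::size_t>(first_zero - bits.begin());
}
```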

Well... it can be done in 'less than linear' time (cheeky method).

    if (n % 2) {
        // Arrange all 1's to the right and DON'T check the right-most bit, because it's 1
    } else {
        // Arrange all 0's to the right and DON'T check the right-most bit, because it's 0.
    }

So, technically you 'group' the bits in less than linear time :P

To me, the most likely interpretations are:

- The bits are supposed to be in an int instead of an array, in which case you can use something like http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan or an 8-bit (or more) lookup table.
- They used "sublinear" to mean "less than n operations" rather than less-than-O(n). But even that seems impossible for the same reasons listed below.
- There is another miscommunication in the question.
- Otherwise the question is wrong, since all elements of the array must be examined to determine the answer, and that is at least 'n' operations.

Listing either 0s or 1s first, and the references to bits rather than bools make me think something like the first option was intended, even though, when dealing with only one word, it doesn't make very much difference. I'm curious to know what the interviewer actually had in mind.

Splitting this work among M parallel processors costs N/M (or O(N)) only if you assume that parallelism increases more slowly than problem size does. For the last ten years or so, parallelism (via the GPU) has been increasing more rapidly than typical problem sizes, and this trend looks to continue for years to come. For a broad class of problems, it is instructive to assume "infinite parallelism" or, more precisely, "parallelism greater than any expected problem size", because the march of progress in GPUs and cloud computing provides such a thing over time.

Assuming infinite parallelism, this problem can be solved in O(log N) time: the addition needed to count the 0 and 1 bits is associative, so the partial sums can be combined pairwise in a tree, which takes about log N time steps. Once the count is known, each processor can write its own output bit in a single step.
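To illustrate the log N claim, here is a sequential simulation of a pairwise tree reduction that counts how many rounds it needs. It is only a sketch: nothing here actually runs in parallel, and `ReductionResult` and `tree_reduce` are hypothetical names. The point is that the additions within one round are independent of each other, so with enough processors each round would cost one time step.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct ReductionResult {
    int sum;             // total number of 1 bits
    std::size_t rounds;  // time steps a parallel tree reduction would need
};

// Sequentially simulates a pairwise tree reduction over the bit array.
ReductionResult tree_reduce(std::vector<int> values) {
    std::size_t rounds = 0;
    while (values.size() > 1) {
        std::vector<int> next;
        next.reserve((values.size() + 1) / 2);
        // Every pair below could be added by a different processor at once.
        for (std::size_t i = 0; i + 1 < values.size(); i += 2) {
            next.push_back(values[i] + values[i + 1]);
        }
        // An odd element out is carried over unchanged to the next round.
        if (values.size() % 2 == 1) {
            next.push_back(values.back());
        }
        values = std::move(next);
        ++rounds;
    }
    return {values.empty() ? 0 : values.front(), rounds};
}
```

For the nine-element example 101010101, the array shrinks 9, 5, 3, 2, 1, so the count of 5 is reached in 4 rounds instead of 8 sequential additions.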