开发者

Solve the problem with O(nlogn) time

开发者 https://www.devze.com 2023-01-18 12:58 出处:网络
Let A[1]<=A[2]<=....<=A[n]. Let X be an arbitrary number. Give an algorithm find all pairs of A[i] and A[j] such that A[j] - A[i] >= X. All numbers are positive integers.

Let A[1]<=A[2]<=....<=A[n]. Let X be an arbitrary number. Give an algorithm find all pairs of A[i] and A[j] such that A[j] - A[i] >= X. All numbers are positive integers.

If you want to see the original problem. Here it is:

Let P = {p1; p2; ; pn} be a set of n p开发者_运维技巧oints in a 2-dimensional space, where pi = (xi; yi) for each i. Let D = (dx; dy). The problem is to decide whether there exists a pair of points pi and pj such that xj - xi >= dx and yj - yi >= dy. You can easily solve this problem in O(n^2) time by considering all possible pairs of points. But we are interested in developing an O(n log n) time algorithm.


Here you can take advantage of the fact that your input is sorted and that all numbers are positive integers. If

A[j] - A[i] ≥ X

then we know the following is also true

A[j + 1] - A[i] ≥ X

So an algorithm might be

for(i = 1; i < n; i++) // for every value (this part is O(n))
{
    int minJ = A[i] + X; // the minimum J that satisfies `A[j] - A[i] >= X`

    int cutoff = binarySearch(minJ); // figure out the minimum J for which  `A[j] - A[i] >= X` (this part is O(log(n))

    storeResults(i, cutoff, n); // In Answers[i], save every qualifying integer (this part is less than O(log(n))
}

In total, you have

O(n * (log(n) + less-than-log(n))
O(n * (2 log(n)))
O(n * log(n))

There's room for some minor optimizations, like only doing the main loop up to n - 1 instead of n, but that's not really critical or relevant to the Big-Oh.


Obviously, max(A[j] - A[i]) is achieved when A[j] is largest and A[i] is smallest: A[n] - A[1]. Thus, you just need to check i = n and j = 1. O(1) :)

edit
If you need to find all such pairs (i, j), then it's obviously O(n^2) task: because there're O(n^2) solutions in general case. So, just go check all pairs.

for (int i = 0; i < n; ++i) {
    for (int j = 0; j < n && A[i] - A[j] >= X; ++j) {
        if (i != j) {
            print("new pair: ", i, j);
        }
    }
}


Since the input is sorted, you could binary search for X - A[i] for each i from 1 to n. Since items can be equal, you'd need to find the upper and lower bounds for the binary search though. I think this would be O(nlogn)...


You're trying to find pairs of numbers A[j] - A[i] >= X.

This should be possible in roughly 2nlogn time:

For each number (order n)
Conduct a binary search (logn) to find the last number that meets the criteria A[j] - A[i]
Print out all the numbers before it (n)

In psuedo code:

for(int i = n; i > 0; i--) // O(n)
{
   int last_number = binarySearch(i); // O(log (n/2)) because on avg we only need to search half the list
   if(last_number != SOME_INVALID_THING)
   {
      for(int x = 0; x < last_number; x++) // O(n)
      {
         printf("%d %d", i, x);
      }
   }
}

Is that correct? Or is this just n^2, it seems like you'll never end up doing much work but the nested loops imply n^2.


I am confused about why this is not an O(n) problem: just loop through the values with an index i and let another index j lag behind so that it is always at or before the value back earlier in the list whose value is at least X less than the current A[i]. For each value A[i], peek back at A[j] and see whether the difference is exactly X, and if so then remember that i,j pair. Then advance j if it needs to be so that A[j+1] is within X of A[i], and return to the top of the loop.

Edit: Nikita Rybak is quite correct that I missed the >= in the problem, but that only makes the problem vastly simpler: instead of the matching j values for a given i being a range with a stop and start, all we need is one boundary, because if, for example, A[6] - A[4] >= X, then obviously the same will hold true for A[6] and everything from A[3] on down.


With a nice data structure it can be done in O(n)

Put A's in a linked list.

Create another linked list with references to A[i] and A[j].

Now we can start.

Ai = Aj = first element of A
while Aj <> null
    if Aj - Ai < X 
    then 
        Aj = Aj.next
    else
        S.ai = Ai
        S.aj = Aj
        Ai = Ai.next
    end 
end

there is a single loop and Aj will advance between 50% and 100% of cases depending on X. Because of the linked lists we avoid an explosion of copies.

==> O(N)

With the addition of the original problem with the 2D problem some modifications are in order, the index must be stored with the value.

there are 2 lists with element xi -> xk and yi -> yl (where xk and yl are the smallest numbers >= xi + dx resp yi + dy.

If you find a couple xi,yi where k - l has a different sign from the previous i's you can walk the xj or yj list to find a point which satisfies the constraint.


I think I've got it, but since this smells like homework, I'll just give a hint. My algorithm uses a list of points which at all times is ordered so that x is increasing and y is decreasing. Points are considered one at a time, and each point falls in one of four cases:

  1. A pair of points satisfying the condition is found.
  2. We can prove that it's safe to ignore the new point without falsely claiming there are no answers.
  3. The new point is inserted into the list, preserving the invariant about its ordering.
  4. The new point replaces one or more points which were previously in the list, and we can prove that the points being removed can safely be ignored.

If all points are examined this way without ever hitting case 1, no pair of points from the input satisfies the condition.


Maybe I was very bad at breaking down the prob, but I got the answer for the original now. Here is my solution:

Step 1: Sort the set P by x-coordinate.

Step 2: Let Q = { q1, q2,...,q_n} where qi = min {y1,y2,..,y_i}

Step 3: If (x_n - x_midpoint >= dx)

      if(y_n - q_midpoint >= dy)

        return there exit;

      else repeat step3 for x_n and the new midpoint of the range after the midpoint

   else

       repeat step 3 for x_n and the new midpoint of the range from the midpoint to the left.

       When there is only one element in the range and the pair hasn't been found, repeat Step 3 for x_n-1, x_n-2,..., x2. 

       If the pair hasn't been found after this, then there exist no such pair.

Sorry everyone if my first question does not make sense!

0

精彩评论

暂无评论...
验证码 换一张
取 消