开发者

choose n largest elements in two vector

开发者 https://www.devze.com 2023-01-30 05:50 出处:网络
I have two vectors, each contains n unsorted elements, how can I get n largest elements in these two vectors?

I have two vectors, each contains n unsorted elements, how can I get n largest elements in these two vectors?

my solution is merge two vector into one with 2n elements, and then use std::nth_element algorithm, but I foun开发者_运维问答d that's not quite efficient, so anyone has more efficient solution. Really appreciate.


You may push the elements into priority_queue and then pop n elements out.


Assuming that n is far smaller than N this is quite efficient. Getting minElem is cheap and sorted inserting in L cheaper than sorting of the two vectors if n << N.

L := SortedList()
For Each element in any of the vectors do
{
  minElem := smallest element in L
  if( element >= minElem or if size of L < n)
  {
    add element to L
    if( size of L > n )
    {
      remove smallest element from L
    }
  }
}


vector<T> heap;
heap.reserve(n + 1);

vector<T>::iterator left = leftVec.begin(), right = rightVec.begin();

for (int i = 0; i < n; i++) {
    if (left != leftVec.end()) heap.push_back(*left++);
    else if (right != rightVec.end()) heap.push_back(*right++);
}

if (left == leftVec.end() && right == rightVec.end()) return heap;

make_heap(heap.begin(), heap.end(), greater<T>());

while (left != leftVec.end()) {
    heap.push_back(*left++);
    push_heap(heap.begin(), heap.end(), greater<T>());
    pop_heap(heap.begin(), heap.end(), greater<T>());
    heap.pop_back();
}

/* ... repeat for right ... */

return heap;

Note I use *_heap directly rather than priority_queue because priority_queue does not provide access to its underlying data structure. This is O(N log n), slightly better than the naive O(N log N) method if n << N.


You can do the "n'th element" algorithm conceptually in parallel on the two vectors quite easiely (at least the simple variant that's only linear in the average case).

  1. Pick a pivot.
  2. Partition (std::partition) both vectors by that pivot. You'll have the first vector partitioned by some element with rank i and the second by some element with rank j. I'm assuming descending order here.
  3. If i+j < n, recurse on the right side for the n-i-j greatest elements. If i+j > n, recurse on the left side for the n greatest elements. If you hit i+j==n, stop the recursion.

You basically just need to make sure to partition both vectors by the same pivot in every step. Given a decent pivot selection, this algorithm is linear in the average case (and works in-place).

See also: http://en.wikipedia.org/wiki/Selection_algorithm#Partition-based_general_selection_algorithm

Edit: (hopefully) clarified the algorithm a bit.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号