Efficiently finding the median value of a random sequence

Developer https://www.devze.com 2023-02-25 03:09 Source: Internet
Numbers are randomly generated and passed to a method. Write a program to find and maintain the median value as new values are generated.

The two heaps can be the same size, or the lower heap (the max-heap) can hold one extra element.

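import java.util.Comparator;
import java.util.PriorityQueue;

// maxHeap holds the smaller half (largest on top); minHeap holds the larger half (smallest on top).
private Comparator<Integer> maxHeapComparator = Comparator.reverseOrder(), minHeapComparator = Comparator.naturalOrder();
private PriorityQueue<Integer> maxHeap = new PriorityQueue<>(maxHeapComparator), minHeap = new PriorityQueue<>(minHeapComparator);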

public void addNewNumber(int randomNumber) {
  if (maxHeap.size() == minHeap.size()) {
    if ((minHeap.peek() != null) && randomNumber > minHeap.peek()) {
      maxHeap.offer(minHeap.poll());
      minHeap.offer(randomNumber);
    } else {
      maxHeap.offer(randomNumber);
    }
  }
  else {  // Why is the following block correct?
    // I think it may leave the heap sizes unbalanced.
    if(randomNumber < maxHeap.peek()) {
      minHeap.offer(maxHeap.poll());
      maxHeap.offer(randomNumber);
    }
    else {
      minHeap.offer(randomNumber);
    }
  }
}

public double getMedian() {
  if (maxHeap.isEmpty()) return minHeap.peek();
  else if (minHeap.isEmpty()) return maxHeap.peek();

  if (maxHeap.size() == minHeap.size()) {
    return (minHeap.peek() + maxHeap.peek()) / 2.0;
  } else if (maxHeap.size() > minHeap.size()) {
    return maxHeap.peek();
  } else {
    return minHeap.peek();
  }
}

Assuming the solution is correct, I don't understand how the marked code block (see my comments) keeps the heap sizes balanced, that is, how it keeps the size difference between the two heaps at 0 or 1.

Let us look at an example, given the sequence 1, 2, 3, 4, 5.
The first random number is **1**
    max-heap: 1
    min-heap:

The second random number is **2**
    max-heap: 1
    min-heap: 2

The third random number is **3**
    max-heap: 1 2
    min-heap: 3

The fourth random number is **4**
    max-heap: 1 2
    min-heap: 3 4

The fifth random number is **5**
    max-heap: 1 2 3
    min-heap: 4 5

Thank you


After running the given sequence through the code, the heaps are:

max-heap : 1, 2, 3
min-heap : 4, 5

Since the max-heap is larger than the min-heap, getMedian() returns 3 as the median.
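To reproduce this result, here is a minimal sketch that feeds 1 through 5 into the same branching logic and prints the heap contents and the median after each insertion. The question does not show how the heaps are constructed, so building them with Comparator.reverseOrder() and natural ordering is an assumption. The class name MedianTrace and the helpers sorted and snapshot are made up for illustration, and getMedian is simplified by relying on the max-heap never being the smaller heap.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical demo class, not part of the original post.
class MedianTrace {
    // Assumed construction: max-heap for the lower half, min-heap for the upper half.
    private final PriorityQueue<Integer> maxHeap = new PriorityQueue<>(Comparator.reverseOrder());
    private final PriorityQueue<Integer> minHeap = new PriorityQueue<>();

    // Same branching as addNewNumber in the question.
    void addNewNumber(int randomNumber) {
        if (maxHeap.size() == minHeap.size()) {
            if (minHeap.peek() != null && randomNumber > minHeap.peek()) {
                maxHeap.offer(minHeap.poll());
                minHeap.offer(randomNumber);
            } else {
                maxHeap.offer(randomNumber);
            }
        } else {
            if (randomNumber < maxHeap.peek()) {
                minHeap.offer(maxHeap.poll());
                maxHeap.offer(randomNumber);
            } else {
                minHeap.offer(randomNumber);
            }
        }
    }

    // Simplified variant: the max-heap is never smaller, so only two cases remain.
    double getMedian() {
        if (maxHeap.isEmpty()) {
            throw new IllegalStateException("no numbers added yet");
        }
        if (maxHeap.size() == minHeap.size()) {
            return (minHeap.peek() + maxHeap.peek()) / 2.0; // 2.0 avoids integer division
        }
        return maxHeap.peek();
    }

    // PriorityQueue iteration order is not sorted, so copy and sort for display.
    private static List<Integer> sorted(PriorityQueue<Integer> heap) {
        List<Integer> copy = new ArrayList<>(heap);
        Collections.sort(copy);
        return copy;
    }

    String snapshot() {
        return "max-heap: " + sorted(maxHeap) + "  min-heap: " + sorted(minHeap);
    }

    public static void main(String[] args) {
        MedianTrace trace = new MedianTrace();
        for (int n = 1; n <= 5; n++) {
            trace.addNewNumber(n);
            System.out.println(n + " -> " + trace.snapshot() + "  median: " + trace.getMedian());
        }
    }
}
```

For this input the last printed line shows max-heap [1, 2, 3], min-heap [4, 5] and a median of 3.0.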

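The max-heap stores approximately the left (smaller) half of the elements, and the min-heap stores approximately the right (larger) half.

This code is biased toward the left half, that is, the max-heap: when the sizes are equal, either path of the first branch grows the max-heap by one, and when the max-heap has one extra, either path of the else block grows the min-heap by one, so the size difference stays at 0 or 1.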
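The balance question can also be checked empirically. The sketch below repeats the same branching logic and verifies after every insertion that the max-heap holds either the same number of elements as the min-heap or exactly one more. The class name BalanceCheck, the sizeDifference helper, the seed and the value range are all arbitrary choices for illustration, and the heap construction is an assumption because the question does not show it.

```java
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical demo class, not part of the original post.
class BalanceCheck {
    private final PriorityQueue<Integer> maxHeap = new PriorityQueue<>(Comparator.reverseOrder()); // lower half
    private final PriorityQueue<Integer> minHeap = new PriorityQueue<>();                          // upper half

    // Same branching as addNewNumber in the question.
    void addNewNumber(int randomNumber) {
        if (maxHeap.size() == minHeap.size()) {
            // Equal sizes: both paths leave maxHeap with exactly one more element.
            if (minHeap.peek() != null && randomNumber > minHeap.peek()) {
                maxHeap.offer(minHeap.poll());
                minHeap.offer(randomNumber);
            } else {
                maxHeap.offer(randomNumber);
            }
        } else {
            // maxHeap has one extra: both paths add exactly one element to minHeap.
            if (randomNumber < maxHeap.peek()) {
                minHeap.offer(maxHeap.poll());
                maxHeap.offer(randomNumber);
            } else {
                minHeap.offer(randomNumber);
            }
        }
    }

    // Number of elements by which maxHeap exceeds minHeap.
    int sizeDifference() {
        return maxHeap.size() - minHeap.size();
    }

    public static void main(String[] args) {
        BalanceCheck check = new BalanceCheck();
        Random random = new Random(42); // fixed seed, chosen arbitrarily
        for (int i = 0; i < 1000; i++) {
            check.addNewNumber(random.nextInt(100));
            int diff = check.sizeDifference();
            if (diff != 0 && diff != 1) {
                throw new IllegalStateException("unbalanced after " + (i + 1) + " insertions");
            }
        }
        System.out.println("size difference stayed 0 or 1");
    }
}
```

The check passes because the difference simply alternates: 1 after an odd number of insertions, 0 after an even number.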

I don't see why this code is incorrect.
