I have a problem where I need to store changing data values v_i (integers) for constant keys i (also integers, in some range say [1;M]). I need to be able to draw quickly a random element weighted by the values v_i, i.e. the probability to draw key k should be v_k/(sum(i=1...M) v_i)
The best idea I could come up with is using a binary tree and storing the partial sum over the values in the subtree rooted at k as the value for key k (still in the range [1;M]). Then, whenever a value changes, I need to update its node and all parent nodes in the tree (takes O(log M) time since the keys are fixed and thus the binary tree is perfectly balanced). Drawing a random element as above also takes O(log M) time (for each level of the tree, one compares the random number say in the range (0,1) against the relative weights of the left subtree, right subtree, and the node itself) and is much faster than the naive algorithm (take random number r, iterate through the elements to find the least k so that sum(i=1...k) < r, sum(i=1...k+1) > r; takes O(M) time).
The question I now have is how to optimize the placement of the tree nodes in memory in order to minimize cache misses. Since all keys are known and remain 开发者_运维问答constant, this is essentially the order in which I should allocate memory for the tree nodes.
Thanks!!
I don't think there is an optimal filling order of a binary tree except something like a pre-order, post-order, in-order filling? Doesn't your question isn't asking how a cache in general can work? Unfortunately I don't know it myself, maybe a more simplier hash-array would be more efficient in your case?
精彩评论