开发者

A specific data structure

开发者 https://www.devze.com 2023-02-01 19:21 出处:网络
Well, this question is a bit specific but I think there is some general idea in it that I can\'t get it.

Well, this question is a bit specific but I think there is some general idea in it that I can't get it.

Lets say I got K servers (which is a constant that I know its size). I have a program that get requests and every request has an id and server id that will handle it. I have n requests - unknown size and can be any number.

I need a data structure to support the next operations within the given complexity:

GetS开发者_如何学Pythonerver - the function gets the request ID and returns the server id that is supposed to handle this request at the current situation and not necessarily the original server (see below).

Complexity: O(log K) at average.

KillServer - the function gets as input a server id that should be removed and another server id that all the requests of the removed server should be passed to.

Complexity: O(1) at the worst case.

--

Place complexity for all the structure is O(K+n)

--

The KillServer function made me think using a Union-Find as I can do the union in O(1) as requested but my problem is the first operation. Why it's LogK? Actually, no matter how I "save" the requests if I want to access to any request (lets say it's an AVL tree) so the complexity will be O(log n) at the worst case and said that I can't assume K>n (and probably K

Tried thinking about it a couple of hours but I can't find any solution. Known structures that can be used are: B+ tree, AVL tree, skip list, hash table, Union-Find, rank tree and of course all the basics like arrays and such.


Consider a data structure consisting of two parts:

  • a finite map (AVL tree, etc.) from client IDs to job IDs
  • a union-find data structure to group job IDs and to label each group with a server ID

Initially, the job IDs are equal to the server IDs. As servers are removed, servers will handle multiple jobs.

In other words, you just add a small indirection client -> job -> server so that your operations touch only the first or only the second arrow.


Union-find starting with requests partitioned by server should work. You start with each request pointing to it's server, then you union together the servers (not the requests). Thus, the amortized cost of GetServer would be O(alpha(K)) rather than O(alpha(n)), where O(alpha(K)) << O(log K).


You could try using a hash table (where each entry of the hash table is a reference to a list):

Let the size of the hash table be the smallest prime number (p) which is greater than K.

The hash function for a requestID, then, would be requestID % p. This would make the complexity for GetServer O(1). However, the downside of this is that there is no balancing of requestIDs.

0

精彩评论

暂无评论...
验证码 换一张
取 消