I'm implementing a sliding window over a stream of events, in Java. So I want a data structure which allows me to do the following:
add to the end of the data structure when new events occur;
remove from the start of the data structure when old events are processed;
get standard random access (
size()
,get(i)
) to the elements of the data structure; in general, typical List "read" operations;is efficient for all of the above operations;
is unbounded.
No other access is required. And no thread safety is required.
I'm currently 开发者_如何学Cdoing this with an ArrayList, to get things up and running. But I want something more efficient; the remove(0)
method (2. above) is inefficient with an ArrayList
.
Numbers 1. and 2. are standard Queue-style operations. However, the implementations of Queue
in the JDK (such as ArrayDeque) don't allow for get(i)
in 3.
So, I'm wondering if there are any libraries out there which have such an implementation, and are suitable for commercial use.
If not, I guess I'll resort to writing my own...
Seems like a task for a Circular Buffer - as long as it's okay if the queue has a fixed capacity. I don't know of any standard implementation though. But here is a nice recipe to roll your own.
I got this problem and tried solving it by copying the source code of ArrayDeque
and adding something like:
E get(index){ return elements[(head + index) % size];}
If you have a large enough array you can implement a queue with a basic array, and just use an index as to the head of the list, and use the mod operator so you can wrap around if needed.
So, you basically have a circular array that supports the insert and remove functions.
UPDATE:
It is a quick operation to copy an array to a larger array, so just double in size, perhaps, when getting close to the end, and just copy the array, as a step in doing insert. Overall you will still have very fast access, as the norm should not be to increase and copy.
How fast are the events coming in and out of this queue?
On the one hand, you can have a "sufficiently large" circular buffer.
While it is technically "bounded", you can make it "unbounded" by growing it as necessary.
By the same token, you can "shrink" it down in terms of overall capacity when it's "quiet".
But, for many applications a 100, 1000, or even 10000 item capacity circular buffer is effectively unbounded.
The only library I can think of that would implement such an interface would be a LinkedList but frankly I'm not sure what the performance characteristics are.
Just throwing this out there as an alternative to rolling your own, so please take with a grain of salt: Depending on how frequently you need the random access get(i)
and what performance you need from it (and how big your queue size will generally be), you could always use ArrayDeque.toArray()[i]
when you need to access an element. The toArray()
uses System.arraycopy()
under the covers which should be pretty fast for small queue sizes and occasional usage. Would help to understand why you need random access to a queue and how often it is needed -- possibly there's a different way to implement your algorithm without it.
If it truly must be unbounded, then something like ConcurrentSkipListMap may be useful, if you assign an incrementing sequence to each event to use as the key in the map. It provided methods such as pollFirst/LastEntry. If you can sacrifice the unbounded nature of it, then a ring buffer maybe what you need.
A binomial heap can have O(1) amortized insert and O(log n) amortized delete min; I believe that it can also have O(log**2 n) amortized random access. Queue-push would insert the element into the heap, with successive integers as keys.
With a rbtree, you can do queue-push with O(log n) pessimistic for all of insert, delete min and random access. That is because the tree will have a contiguos set of integers as the keys, and the k-th element of the queue will be the element in the tree with the k-th key.
精彩评论