Is there a known implementation of an indexed linked list?

My gut tells me there is no good way to achieve this, but, unlike Mr. Stephen Colbert, I'd rather trust a community of developers than my gut...

Is there a known way to efficiently implement a "best of both worlds" list, one that provides random access by index and O(1) insertion/removal like a linked list?

I foresee two possible outcomes: either "No, this is impossible, for the following obvious reasons..." or "Uh, yes, this has been done; see here, here, and here."


I don't believe it will be possible to get O(1) for both insertion and lookup. The minute you add an array (or even fancy, splittable vectors), the insertion becomes O(n).

There are ways to mitigate the damage depending on the expected behavior of your list. If there will be a lot more lookups than insertions/deletions, it may be better to just use vectors (variable-sized arrays). These are reasonably efficient; not quite as fast as plain arrays, but better than traversing lists, since they tend to be implemented as lists of array blocks. Lookup is still technically a list traversal, but each block records its size, so you can skip whole blocks rather than stepping through individual elements.

If insertions and deletions are more frequent, you can make the index rebuild lazy, so that it's only done when required. For example, inserts and deletes would only change the linked-list portion (and mark the index as dirty); only when someone tries to use the index is it rebuilt and marked as clean.

You can even optimize the rebuild by keeping a record of the first dirty entry. This will mean if you only insert or delete in the last half of the list, you don't need to rebuild the entire index when someone wants to use it.
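
A minimal sketch of that lazy index, assuming a singly linked list whose index array is rebuilt only when an indexed read actually happens (class and method names here are illustrative, not from the original answer, and the "first dirty entry" refinement is omitted for brevity):

    import java.util.ArrayList;

    // Illustrative sketch: a singly linked list with a lazily rebuilt index array.
    class LazyIndexedList<T> {
        private static class Node<T> {
            T value;
            Node<T> next;
            Node(T value) { this.value = value; }
        }

        private Node<T> head;
        private final ArrayList<Node<T>> index = new ArrayList<>();
        private boolean indexDirty = true;

        // O(1): splice in a new head node and mark the index dirty.
        public void addFirst(T value) {
            Node<T> node = new Node<>(value);
            node.next = head;
            head = node;
            indexDirty = true;
        }

        // O(1) once the index is clean; O(n) the first time after a modification.
        public T get(int i) {
            if (indexDirty) rebuildIndex();
            return index.get(i).value;
        }

        // O(n): walk the list once, recording a reference to every node.
        private void rebuildIndex() {
            index.clear();
            for (Node<T> n = head; n != null; n = n.next) index.add(n);
            indexDirty = false;
        }
    }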

A solution I once implemented was a 2D List. By this, I mean:

        +-----------+    +-----------+    +-----------+
List -> | count = 7 | -> | Count = 4 | -> | Count = 6 | -> null
        +-----------+    +-----------+    +-----------+
              |                |                |
              V                V                V
        +-----------+    +-----------+    +-----------+
        |    [0]    |    |    [7]    |    |   [11]    |
        +-----------+    +-----------+    +-----------+
              |                |                |
              V                V                V
        +-----------+    +-----------+    +-----------+
        |    [1]    |    |    [8]    |    |   [12]    |
        +-----------+    +-----------+    +-----------+
              |                |                |
              :                :                :
              :                :                :
              |                |                |
              V                V                V
        +-----------+    +-----------+    +-----------+
        |    [6]    |    |   [10]    |    |   [16]    |
        +-----------+    +-----------+    +-----------+
              |                |                |
              V                V                V
             null             null             null

While this made both insertion and lookup O(n), the balance between them was much better. In a pure array solution, lookup is O(1) and insertion is O(n). For a pure linked list, insertion is O(1) (once you've found the insertion point, of course, an operation that is itself O(n)) and lookup is O(n).

The 2D list is O(n) for both but with a lower constant factor. If you're looking to insert, you can find the right column simply by examining the first row of each column. Then you traverse the column itself looking for the right row, insert the item, and increase the count for that column. Deletions work similarly, except that the count is decreased, and the entire column is removed when its count reaches zero.

For an index lookup, you traverse the columns to find the correct column, then traverse the items in the column to get the right item.

And, it can even be auto-adjusting by trying to keep the maximum height and width roughly the same.
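
A rough sketch of the indexed lookup described above, assuming each column header stores its count so whole columns can be skipped (the names and layout here are mine, not the original implementation):

    // Illustrative 2D-list lookup: skip whole columns using their counts.
    class TwoDList<T> {
        static class Node<T> { T value; Node<T> down; }
        static class Column<T> { int count; Node<T> top; Column<T> next; }

        private Column<T> firstColumn;

        // Find the item at logical index i: cost is roughly (columns + rows), not n.
        public T get(int i) {
            Column<T> col = firstColumn;
            // Walk the header row, subtracting each column's count until i falls inside one.
            while (col != null && i >= col.count) {
                i -= col.count;
                col = col.next;
            }
            if (col == null) throw new IndexOutOfBoundsException();
            Node<T> node = col.top;
            for (int k = 0; k < i; k++) node = node.down;   // walk down within the column
            return node.value;
        }
    }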


If you think that O(log N) == O(1), check out:

  • Red-black tree, O(log N) on every operation
  • Skip list, O(log N) on every operation
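
Both are already in the JDK if keyed (rather than positional) access is acceptable: java.util.TreeMap is a red-black tree and java.util.concurrent.ConcurrentSkipListMap is a skip list. For example:

    import java.util.NavigableMap;
    import java.util.TreeMap;
    import java.util.concurrent.ConcurrentSkipListMap;

    public class LogNExamples {
        public static void main(String[] args) {
            // Red-black tree: O(log n) put/get/remove, keys kept in sorted order.
            NavigableMap<Integer, String> rbTree = new TreeMap<>();
            rbTree.put(42, "answer");

            // Skip list: expected O(log n) for the same operations, and safe for concurrent use.
            NavigableMap<Integer, String> skipList = new ConcurrentSkipListMap<>();
            skipList.put(42, "answer");

            System.out.println(rbTree.get(42) + " / " + skipList.firstKey());
        }
    }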


When I was implementing a linked list in class, I thought about optimizing access time by storing three additional fields: the node in the middle of the list, the index of the most recently accessed node, and the most recently accessed node itself.

To retrieve a node by index, I would first look at all available paths to reach the node at the given index and then choose the cheapest one. The ways would simply be:

  1. Going from the start to the desired index
  2. Going from the most recently accessed node to the desired index (forwards)
  3. Going from the most recently accessed node to the desired index (backwards)
  4. Going from the middle node to the desired index (forwards)
  5. Going from the middle node to the desired index (backwards)
  6. Going from the end of the list to the desired index (backwards)

The path with the smallest difference between the desired index and the starting index would be the cheapest option. If no node has been accessed yet, the most recently accessed node could be set to the middle node. Of course, with an even number of elements there is no actual middle, so I'd just choose the floor of n/2.

Anyway, I never got around to actually implementing this optimization or even really analyzing it, but I hope I could help.
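
For what it's worth, here is a minimal sketch of that path selection, assuming a doubly linked list that caches its middle node and the most recently accessed node (everything here is hypothetical, since the original was never implemented, and maintenance of the cached middle node on insert/delete is omitted):

    // Illustrative: pick the cheapest starting point for an indexed get.
    class CachedAccessList<T> {
        static class Node<T> { T value; Node<T> prev, next; }

        private Node<T> head, tail, middle, lastAccessed;
        private int size, middleIndex;
        private int lastAccessedIndex = -1;

        public T get(int index) {
            // Candidate starting points and their distances to the target index.
            Node<T> start = head;
            int startIndex = 0;
            int best = index;                                // cost of walking forwards from the head
            if (size - 1 - index < best) {                   // walking backwards from the tail
                best = size - 1 - index; start = tail; startIndex = size - 1;
            }
            if (middle != null && Math.abs(index - middleIndex) < best) {
                best = Math.abs(index - middleIndex); start = middle; startIndex = middleIndex;
            }
            if (lastAccessed != null && Math.abs(index - lastAccessedIndex) < best) {
                start = lastAccessed; startIndex = lastAccessedIndex;
            }

            // Walk forwards or backwards from the chosen node.
            Node<T> node = start;
            for (int i = startIndex; i < index; i++) node = node.next;
            for (int i = startIndex; i > index; i--) node = node.prev;

            lastAccessed = node;                             // remember for the next call
            lastAccessedIndex = index;
            return node.value;
        }
    }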


Your gut is correct on this.

Linked lists are O(1) insertion/deletion because the operation you perform to insert or remove something is just switching a couple of pointers (one or two on the object you're inserting, and one or two on one or two other objects). This doesn't change with the size of the list, obviously.
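
For example, splicing a new node in after a known node of a doubly linked list is just four pointer writes, regardless of list length (a generic sketch, not any particular library's implementation):

    // Illustrative O(1) splice into a doubly linked list.
    class Splice {
        static class Node<T> { T value; Node<T> prev, next; }

        // Insert newNode immediately after node: only pointers change, no traversal.
        static <T> void insertAfter(Node<T> node, Node<T> newNode) {
            newNode.prev = node;
            newNode.next = node.next;
            if (node.next != null) node.next.prev = newNode;
            node.next = newNode;
        }
    }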

A skip list will give you O(log n) lookup, but since you're maintaining an index it also means O(log n) insertion/deletion, because that index needs to be kept up to date.

Any parallel data structure you use for lookup will need to be maintained, so your complexity will scale by that index's complexity.

Do you have a particular problem in mind you're trying to solve?

For instance, you can get O(1) insertion, removal and lookup if you can guarantee a perfect hash. But you need to know some things about your data ahead of time for that to work.


How about a hash table? You get O(1) random access by key and O(1) insertion/deletion. The catch is that entries are unordered.
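
For example, with the JDK's HashMap (expected O(1) per operation, and keyed rather than positional):

    import java.util.HashMap;
    import java.util.Map;

    public class HashDemo {
        public static void main(String[] args) {
            Map<String, Integer> table = new HashMap<>();
            table.put("alpha", 1);                    // expected O(1) insertion
            table.remove("alpha");                    // expected O(1) deletion
            table.put("beta", 2);
            System.out.println(table.get("beta"));    // expected O(1) lookup by key
            // Note: entries in a HashMap are unordered; iteration order is unspecified.
        }
    }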

For an efficient implementation of ordered sequences, check out finger trees. They give you O(1) access to head and last and O(log n) random access to inner nodes. Insert or delete at either end in O(1). Notably, reversal of a finger tree takes constant time.


I don't know the exact big-O for insertion (as that would vary based upon sample size and growth), but Java's java.util.LinkedList comes immediately to mind.

http://java.sun.com/j2se/1.5.0/docs/api/java/util/LinkedList.html

EDIT: yes, apparently underneath it is still a true linked list, and as such indexed gets could be O(n/2), which of course is formally O(n).

You could always waste a whole bunch of space and write a List implementation that keeps a parallel linked list and array with deferred inserts/removals.


While I don't think you can get integer indexing, a backing hash table might work if you are using 'reference' types.


Java's LinkedList has O(n) access for indexed gets. LinkedList extends AbstractSequentialList to show that it does not offer O(1) indexed gets.

I'd suggest taking a look at Apache's TreeList. It offers O(log n) inserts, removals, and indexed lookups.
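
If Apache Commons Collections is on the classpath, TreeList can be used as a drop-in java.util.List; a minimal usage sketch (the import below is from Commons Collections 4, so adjust for your version):

    import java.util.List;
    import org.apache.commons.collections4.list.TreeList;

    public class TreeListDemo {
        public static void main(String[] args) {
            List<String> list = new TreeList<>();   // backed by a balanced tree
            list.add("a");
            list.add("c");
            list.add(1, "b");                       // O(log n) positional insert
            System.out.println(list.get(1));        // O(log n) indexed get -> "b"
        }
    }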
