How expensive is a call to java.util.HashMap.keySet()?

I implemented a sparse matrix as List<Map<Integer,Double>>.

To get all entries of row i I call list.get(i).keySet(). How expensive is this call?

I also used the trove library for an alternative implementation as List<TIntDoubleHashMap>.

What's the cost of calling list.get(i).keys() here?

Do you have any further ideas of how to implement an efficient sparse matrix?

Or can you provide a list of existing implementations in java?

It depends on the classes implementing List and Map. If you are using a List class that implements java.util.RandomAccess (e.g. ArrayList), then a call to get(i) is O(1). If it is a LinkedList, get(i) is O(n).
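To make the RandomAccess distinction concrete, here is a minimal sketch that checks the marker interface at runtime. The class name RandomAccessCheck and the helper hasFastGet are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class RandomAccessCheck {
    /** Returns true if the list advertises constant-time get(i) via the RandomAccess marker. */
    static boolean hasFastGet(List<?> list) {
        return list instanceof RandomAccess;
    }

    public static void main(String[] args) {
        System.out.println(hasFastGet(new ArrayList<Integer>()));  // true
        System.out.println(hasFastGet(new LinkedList<Integer>())); // false
    }
}
```

RandomAccess is only a marker interface, so this tells you what the list promises, not what it measures.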

-- Edited to show the following code snippet (since verdy_p below doesn't read well, and likes to go off the tangent): --

// In HashMap.java, line 867, JDK 1.6.0.24, how much more
// constant time do we want?

public Set<K> keySet() {
    Set<K> ks = keySet;
    return (ks != null ? ks : (keySet = new KeySet()));
}

-- end of edit --

A call to keySet() on most Map implementations will be constant time.
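The reason the call is cheap is that keySet() hands back a view rather than a copy. A small sketch of that behaviour follows; KeySetViewDemo and viewTracksMap are hypothetical names used only here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class KeySetViewDemo {
    /** Returns true if a key added after keySet() was called shows up in the earlier view. */
    static boolean viewTracksMap(Map<Integer, Double> row, int newColumn) {
        Set<Integer> keys = row.keySet(); // no keys are copied here
        row.put(newColumn, 1.0);          // modify the map afterwards
        return keys.contains(newColumn);  // the view is backed by the map
    }

    public static void main(String[] args) {
        System.out.println(viewTracksMap(new HashMap<Integer, Double>(), 42));
    }
}
```

Because the set is backed by the map, the cost you actually pay is in iterating it, not in obtaining it.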

Regarding traversing the keySet(): if you are using an array-backed Map implementation (like HashMap), the key set's iterator walks the map's internal bucket array. So iteration over keySet() is O(n); more precisely, it takes time proportional to the number of entries plus the capacity of the table.

I would also assume that is the case for most (if not all) Map implementations that are backed by arrays.

For SortedMap implementations (like TreeMap), iterating over the keys means walking the tree from the lowest to the greatest key. This is an in-order traversal, which visits every node once and is therefore O(n).
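As an illustration of the sorted case, this sketch collects the column indices of one row stored in a TreeMap. SortedRowDemo and columnsInOrder are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class SortedRowDemo {
    /** Walks the key set once (O(n)) and returns the column indices in ascending order. */
    static List<Integer> columnsInOrder(TreeMap<Integer, Double> row) {
        List<Integer> cols = new ArrayList<>();
        for (Integer col : row.keySet()) {
            cols.add(col);
        }
        return cols;
    }

    public static void main(String[] args) {
        TreeMap<Integer, Double> row = new TreeMap<>();
        row.put(5, 0.5);
        row.put(1, 0.1);
        row.put(3, 0.3);
        System.out.println(columnsInOrder(row)); // [1, 3, 5]
    }
}
```

The ordering comes for free with TreeMap, at the price of O(log n) get and put instead of HashMap's expected O(1).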

Both cases appear to be O(n). If you use Eclipse, you can actually look at the code implementing the Java classes and get a better idea of their complexity.

For classes under java.util.concurrent (like ConcurrentHashMap), you'll have to take other considerations into account to determine how expensive they are.

To expand a bit more: if you use a linked list, list.get(i).keySet() will be O(n). With ArrayList, it will be O(1). Traversing the key set will depend on whether you use an array-backed Map (HashMap) or a SortedMap (TreeMap). In both cases, traversing will be O(n), with the former being significantly faster than the latter, since walking an array is generally faster than following pointers (or references, in this Java-specific case).

Now, if you loop over all rows with list.get(i).keySet() on a linked list implementation, the list access alone costs O(n^2), because each get(i) is O(n). So instead of calling list.get(i).keySet() in an index loop, you should use the list's iterator (see the sketches below, where the loop body is elided).

This is O(n^2) for lists that do not implement java.util.RandomAccess (like LinkedList):

for( int i = 0; i < list.size(); i++ )
{
   // get(i) has to walk the list on a LinkedList, so this line alone is O(n)
   Set<Integer> keySet = list.get(i).keySet();
   for( Integer key : keySet )
   {
      ... stuff (assuming constant time) ...
   }
}

This is O(n) for the same kind of List implementation:

// the for-each loop uses the list's iterator, so each step is O(1)
for( Map<Integer, Double> m : list )
{
   for( Integer key : m.keySet() )
   {
      ... stuff (assuming constant time) ...
   }
}
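For reference, here is a self-contained version of the iterator-based traversal. It is only a sketch: RowTraversal and countEntries are hypothetical names, and counting stands in for the elided loop body.

```java
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class RowTraversal {
    /** Visits every stored column index once and returns how many were seen. */
    static int countEntries(List<Map<Integer, Double>> rows) {
        int count = 0;
        for (Map<Integer, Double> row : rows) {   // list iterator, no get(i) calls
            for (Integer col : row.keySet()) {    // one step per stored entry
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Map<Integer, Double>> rows = new LinkedList<>();
        Map<Integer, Double> row0 = new HashMap<>();
        row0.put(0, 1.0);
        row0.put(3, 2.0);
        rows.add(row0);
        rows.add(new HashMap<Integer, Double>());
        System.out.println(countEntries(rows)); // 2
    }
}
```

The same method works unchanged for ArrayList, so nothing is lost by always iterating this way.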

It's as cheap as it gets, since it's a view.

From the JDK 7 source, line 884:

public Set<K> keySet() {
    Set<K> ks = keySet;
    return (ks != null ? ks : (keySet = new KeySet()));
}

Trove is probably faster since, unlike the Java Collections Framework, it can work directly with primitives without expensive boxing/unboxing.
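To show what "working directly with primitives" can look like, here is a rough sketch of a single sparse row kept in two parallel primitive arrays. This is not how Trove or any other library implements its maps; PrimitiveRow and all of its methods are invented for this illustration, and insertion is O(n) because the arrays are kept sorted.

```java
import java.util.Arrays;

class PrimitiveRow {
    private int[] cols = new int[4];       // column indices, kept sorted
    private double[] vals = new double[4]; // vals[k] belongs to cols[k]
    private int size = 0;

    /** Stores value at the given column, overwriting any previous value. */
    void set(int col, double value) {
        int pos = Arrays.binarySearch(cols, 0, size, col);
        if (pos >= 0) {            // column already present
            vals[pos] = value;
            return;
        }
        int ins = -pos - 1;        // insertion point that keeps cols sorted
        if (size == cols.length) { // grow both arrays together
            cols = Arrays.copyOf(cols, size * 2);
            vals = Arrays.copyOf(vals, size * 2);
        }
        System.arraycopy(cols, ins, cols, ins + 1, size - ins);
        System.arraycopy(vals, ins, vals, ins + 1, size - ins);
        cols[ins] = col;
        vals[ins] = value;
        size++;
    }

    /** Returns the stored value, or 0.0 for an absent column. */
    double get(int col) {
        int pos = Arrays.binarySearch(cols, 0, size, col);
        return pos >= 0 ? vals[pos] : 0.0;
    }

    /** Returns a copy of the stored column indices in ascending order. */
    int[] columns() {
        return Arrays.copyOf(cols, size);
    }
}
```

No Integer or Double objects are created here, which is the overhead the boxed Map<Integer,Double> version pays on every put and get.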

According to "Sparse matrices / arrays in Java", the Colt library includes this functionality; diving into their Javadoc API, this seems to be true, and timings are included.

Additionally, your implementation seems to use no column-wise sparsity (you only have hashmaps on the rows). Theirs does, and is optimized for ints and doubles, as is the case in Trove (but not in the standard Java case, which uses objects with considerable overhead). I recommend Colt.