Thread-safe iteration over a collection

Developer https://www.devze.com 2023-01-31 18:30 Source: Internet
We all know when using Collections.synchronizedXXX (e.g. synchronizedSet()) we get a synchronized "view" of the underlying collection.

However, the documentation of these wrapper-generating methods states that we have to explicitly synchronize on the collection when iterating over it with an iterator.

Which option do you choose to solve this problem?

I can only see the following approaches:

  1. Do it as the documentation states: synchronize on the collection
  2. Clone the collection before calling iterator()
  3. Use a collection whose iterator is thread-safe (I am only aware of CopyOnWriteArrayList/Set)

And as a bonus question: when using a synchronized view - is the use of foreach/Iterable thread-safe?


You've already answered your bonus question really: no, using an enhanced for loop isn't safe - because it uses an iterator.
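
As a minimal sketch of option 1 (the class and method names here are made up for illustration), the whole loop, not just each call, runs while holding the lock on the synchronized wrapper:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical example class, not part of the original question.
class SynchronizedIterationSketch {
    // Sums the elements while holding the wrapper's lock for the whole loop.
    static int sum(Set<Integer> syncSet) {
        int total = 0;
        // Lock on the wrapper returned by Collections.synchronizedSet,
        // not on the backing set.
        synchronized (syncSet) {
            for (int value : syncSet) { // for-each calls syncSet.iterator()
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Set<Integer> numbers = Collections.synchronizedSet(new HashSet<Integer>());
        Collections.addAll(numbers, 1, 2, 3);
        System.out.println(sum(numbers));
    }
}
```

Other threads that call add() or remove() on the same wrapper block until the loop finishes, which is why this fits short loops best.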

As for which is the most appropriate approach - it really depends on your context:

  • Are writes very infrequent? If so, CopyOnWriteArrayList may be most appropriate.
  • Is the collection reasonably small, and the iteration quick? (i.e. you're not doing much work in the loop) If so, synchronizing may well be fine - especially if this doesn't happen too often (i.e. you won't have much contention for the collection).
  • If you're doing a lot of work and don't want to block other threads working at the same time, the hit of cloning the collection may well be acceptable.
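
To illustrate the first bullet, here is a small sketch (hypothetical class and method names) of how a CopyOnWriteArrayList iterator works on a snapshot, so the list can be modified during the loop without any locking in the caller:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical example class for illustration only.
class CopyOnWriteSketch {
    // Iterates over the list while appending to it inside the loop.
    // Returns how many elements the loop actually visited.
    static int visitWhileAppending(List<String> list) {
        int visited = 0;
        for (String item : list) {    // the iterator sees the array as it was at this point
            list.add(item + "-copy"); // an ArrayList would throw ConcurrentModificationException here
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        List<String> names = new CopyOnWriteArrayList<String>(new String[] {"a", "b"});
        System.out.println(visitWhileAppending(names)); // visits the 2 original elements
        System.out.println(names.size());               // list now holds 4 elements
    }
}
```

Every add() copies the whole backing array, which is the cost that makes this choice attractive only when writes are rare.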


Depends on your access model. If you have low concurrency and frequent writes, 1 will have the best performance. If you have high concurrency and infrequent writes, 3 will have the best performance. Option 2 is going to perform badly in almost all cases.

foreach calls iterator(), so exactly the same things apply.


You could use one of the newer collections added in Java 5.0 which support concurrent access while iterating. Another approach is to take a copy using toArray which is thread safe (during the copy).

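Collection<String> words = ...
// Enhanced for loop over an array snapshot, not over the collection itself.
for (String word : words.toArray(new String[0])) {
    // Work with word here; later changes to words do not affect this loop.
}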


I might be totally off regarding your requirements, but in case you are not aware of it, check out google-collections with "Favor immutability" in mind.
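
The idea behind "Favor immutability" can be sketched with plain JDK classes. This is not the google-collections API, just an assumed stand-in with made-up names: once a collection can never change, any thread may iterate it without locking.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical example class; uses only java.util, not google-collections.
class ImmutableSnapshotSketch {
    // Copies the source once and wraps the copy so it cannot be modified afterwards.
    static List<String> freeze(Collection<String> source) {
        return Collections.unmodifiableList(new ArrayList<String>(source));
    }

    public static void main(String[] args) {
        List<String> mutable = new ArrayList<String>();
        mutable.add("a");
        List<String> frozen = freeze(mutable);
        mutable.add("b");             // does not show up in frozen
        for (String s : frozen) {     // no lock needed: frozen never changes
            System.out.println(s);
        }
    }
}
```

If the source is shared between threads, the copy inside freeze() itself still has to be made safely, for example under the same lock the writers use.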


I suggest dropping Collections.synchronizedXXX and handling all locking uniformly in the client code. The basic collections don't support the sort of compound operations useful in threaded code, and even if you use java.util.concurrent.* the code is more difficult. I suggest keeping as much code as possible thread-agnostic. Keep difficult and error-prone thread-safe (if we are very lucky) code to a minimum.
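
One way to read this advice, as a rough sketch with hypothetical names: keep a plain collection private to one small class and guard every compound operation with a single lock, so the rest of the code never deals with synchronization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: the only place that knows about locking.
class WordRegistry {
    private final Object lock = new Object();
    private final List<String> words = new ArrayList<String>(); // plain, unsynchronized list

    // Compound check-then-act: atomic because both steps run under one lock.
    // A synchronized wrapper alone could not guarantee this.
    boolean addIfAbsent(String word) {
        synchronized (lock) {
            if (words.contains(word)) {
                return false;
            }
            words.add(word);
            return true;
        }
    }

    // Reads take the same lock so they never observe a half-finished update.
    int size() {
        synchronized (lock) {
            return words.size();
        }
    }
}
```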


All three of your options will work. Choosing the right one for your situation will depend on what your situation is.

CopyOnWriteArrayList will work if you want a list implementation and you don't mind the underlying storage being copied every time you write. This is pretty good for performance as long as you don't have very big collections.

ConcurrentHashMap or a "ConcurrentHashSet" (built using Collections.newSetFromMap) will work if you need a Map or Set interface; obviously you don't get random access this way. One great thing about these two is that they work well with large data sets - a mutation only touches a small part of the underlying data storage instead of copying all of it.
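
A small sketch of the "ConcurrentHashSet" idea (class and method names are invented for illustration). The iterators of these collections are weakly consistent, so modifying the set during iteration does not throw ConcurrentModificationException:

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical example class for illustration only.
class ConcurrentSetSketch {
    // Builds a concurrent Set view backed by a ConcurrentHashMap.
    static Set<String> newConcurrentSet() {
        return Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
    }

    // Removes each element as it is visited, with no external locking.
    static int drain(Set<String> set) {
        int removed = 0;
        for (String s : set) {
            if (set.remove(s)) {
                removed++;
            }
        }
        return removed;
    }
}
```

Keep in mind that such an iterator may or may not reflect changes made by other threads after it was created, so it is not a consistent snapshot.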


It depends on the result you need to achieve. Cloning, copying, toArray(), new ArrayList(..) and the like obtain a snapshot and do not lock the collection during the iteration. Using synchronized(collection) around the iteration ensures that no modification happens until the iteration ends, i.e. it effectively locks the collection.

Side note: toArray() is usually preferred (with some exceptions where it internally needs to create a temporary ArrayList). Also note that anything other than toArray() should be wrapped in synchronized(collection) as well, if you are using Collections.synchronizedXXX.
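
The two snapshot styles from this answer could look roughly like this (hypothetical helper names, assuming the argument is a Collections.synchronizedXXX wrapper):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper class for illustration only.
class SnapshotSketch {
    // toArray() on a synchronized wrapper already runs under the wrapper's lock.
    static Object[] viaToArray(Collection<?> syncCollection) {
        return syncCollection.toArray();
    }

    // Any other way of copying is wrapped in an explicit synchronized block,
    // following the side note above.
    static <T> List<T> viaCopy(Collection<T> syncCollection) {
        synchronized (syncCollection) {
            return new ArrayList<T>(syncCollection);
        }
    }
}
```

Either result can then be iterated freely; it simply will not see writes made after the copy was taken.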


This question is rather old (sorry, I am a bit late) but I still want to add my answer.

I would choose your second choice (i.e. Clone the collection before calling iterator()) but with a major twist.

Assuming you want to iterate using an iterator, you do not have to copy the Collection before calling .iterator(), which sort of negates (I am using the term "negating" loosely) the idea of the iterator pattern. Instead, you could write a "ThreadSafeIterator".

It would work on the same premise, copying the Collection, but without letting the iterating class know that you did just that. Such an Iterator might look like this:

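import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.NoSuchElementException;
import java.util.Queue;

class ThreadSafeIterator<T> implements Iterator<T> {
    // Private snapshot of the source; iteration only ever touches this copy.
    private final Queue<T> clients;
    private T currentElement;
    private final Collection<T> source;

    ThreadSafeIterator(final Collection<T> collection) {
        // Copy under the collection's lock so that writers which synchronize
        // on the same object cannot interleave with the copy.
        synchronized (collection) {
            clients = new LinkedList<>(collection);
        }
        this.source = collection;
    }

    @Override
    public boolean hasNext() {
        // isEmpty() instead of peek() != null, so null elements do not end the iteration early.
        return !clients.isEmpty();
    }

    @Override
    public T next() {
        if (clients.isEmpty()) {
            throw new NoSuchElementException();
        }
        currentElement = clients.poll();
        return currentElement;
    }

    @Override
    public void remove() {
        // Removes the last returned element from the original collection, not from the snapshot.
        synchronized (source) {
            source.remove(currentElement);
        }
    }
}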

Taking this a step further, you might use the Semaphore class to ensure thread safety or something. But take the remove method with a grain of salt.

The point is, by using such an Iterator, no one, neither the iterating nor the iterated class (is that a real word?), has to worry about thread safety.
