开发者

Java: why are iterators not copyable

开发者 https://www.devze.com 2023-01-18 10:26 出处:网络
I would think that Iterator.copy() would be quite a handy function. You could implement ite开发者_JAVA技巧rator filters in a much better way.

I would think that Iterator.copy() would be quite a handy function. You could implement ite开发者_JAVA技巧rator filters in a much better way.

For example, the only reason in Googles Java Collection for the filter (and similar) functions to use UnmodifiableIterator (which is just an Iterator without remove) is because you cannot implement such a filter Iterator otherwise without being able to copy it at some point. (Really, that is not possible with the current interface; try yourself.)

Another advantage would be that you could use an iterator in a for-each-loop: Because a copy-able iterator would automatically also be iterable. See also this question. Right now, the main design reason to not allow this is because an Iterator which implements Iterable and Iterator<T> iterator() { return this; } would invalidate the iterator. By having a copy function, it is as simple as Iterator<T> iterator() { return copy(); } and it would not invalidate the original iterator. Thus there is no reason anymore to not allow this.

Is there any reason? Just to make it less complicated to implement it?


Although they usually are, Iterators do not theoretically have to be linked to a collection. The copy method over an input stream, for instance, would be difficult to implement, and would very easily cause obscure memory problems.


An Iterator represents a position in a stream from a source (Iterable in java speak), and there is no guarantee that it is possible to copy or even access the source of the stream.

For example, you could be iterating over bytes as they are streamed from a webserver, in which case it would be impossible to tell the webserver mid-stream to "From this position on, i want you to send me the same bytes twice, but asynchronously as i request them."

There is only the one stream, and it can't be copied.

The fact that most of the Iterators you usually see are over a Collection, is incidental.


The only reason why Google have UnmodifiableIterator is to basically guarantee immutability in their collections. They're making sure that there's no way that you can change the internal state of a collection.

Don't forget that the original idea for an iterator is that it's a pointer to the current element during transveral, and it manages to next/previous transversal (for reverse for doubly-linked iterators) to the element next/previous to it.

There is no actual reason why iterators aren't Cloneable, quite simply, cloning an iterator will still mean having an iterator pointing to the same collection elements (except it now lives in 2 different address space). Unless you want the cloned iterator to point to another collections, there is no point.


I wanted something like this, here is what I've done (based on some work done on Lambdaj).
The main flaw is that this Iterator will basically fill a List with all the supposed content of the Iterator which could be really heavy in memory.

Why did I used a List, because sometimes an Iterator iterates in a specific order, so the "sub-Iterators" must do the same (and the ListIterator really helps me here).

public class IterableIterator<T> implements Iterable<T>, Iterator<T> {
    //The content of the given iterator. Will be filled by its iterators.
    private final List<T> iteratorContent = new ArrayList<T>();
    private final Iterator<T> originalIterator;
    private final Iterator<T> innerIterator;

    public IterableIterator(Iterator<T> originalIterator) {
        this(originalIterator, false);
    }

    public IterableIterator(Iterator<T> originalIterator, boolean cache) {
        if (originalIterator == null) {
            throw new IllegalArgumentException("Parameter can't be null");
        }

        this.originalIterator = originalIterator;
        if (cache) {
            while (originalIterator.hasNext()) {
                iteratorContent.add(originalIterator.next());
            }
        }

        innerIterator = iterator();
    }

    @Override
    public Iterator<T> iterator() {
        return new IteratorIterator();
    }

    @Override
    public boolean hasNext() {
        return innerIterator.hasNext();
    }

    @Override
    public T next() {
        return innerIterator.next();
    }

    @Override
    public void remove() {
        innerIterator.remove();
    }

    private class IteratorIterator implements Iterator<T> {
        private ListIterator<T> innerIterator = iteratorContent.listIterator();

        @Override
        public boolean hasNext() {
            return innerIterator.hasNext() || originalIterator.hasNext();
        }

        @Override
        public T next() {
            if (!innerIterator.hasNext() && originalIterator.hasNext()) {
                T item;
                synchronized (originalIterator) {
                    item = originalIterator.next();
                    iteratorContent.add(item);
                }
                innerIterator = iteratorContent.listIterator(innerIterator.nextIndex());
            }
            if (innerIterator.hasNext()) {
                try {
                    return innerIterator.next();
                } catch (ConcurrentModificationException e) {
                    //Quick and dirty solution if you have a concurrent modification.
                    //It can't happen from the outside, so you can easily suppose that another originalIterator
                    //from this class has been called and had added elements to the list.
                    //Best thing to do, reset the originalIterator to the current position.
                    innerIterator = iteratorContent.listIterator(innerIterator.nextIndex());
                    return innerIterator.next();
                }
            }

            throw new NoSuchElementException();
        }

        @Override
        public void remove() {
            throw new UnsupportedOperationException();
        }
    }
}


As a simplistic example of why you would want to copy an iterator, consider the following code which finds the first pair of matching values in a single array.

for(int i=0;i<size;i++)
{
  x = array[i];

  for(int j=i+1;j<size;j++)
  {
    y = array[j];
    if(x == y)
    {
      doSomething();
      break;
    }
}

Note the "j=i+1". That's where you run into problems with an iterator. Oh well, workarounds are fairly common in Java it seems...


You can always implement your own CopyableIterator that implements Iterator. And then you can do

new CopyableItereator(collection);

The class would be like this

class CopyableIterator implements Iterator{
Iterator iterator;
Collection collection;
int index=0;

public CopyableIterator(Collection collection){
super();
this.collection = collection;
this.iterator = collection.iterator();
}

public CopyableIterator(Collection collection, int index){
super();
this.collection =collection;
this.iterator = collection.iterator();
this.advanceToIndex(iterator,index); //This function just moves the iterator till the index.
this.index=index;
}

//Override the functions of Iterator here returning iterator.function()

@Override
public Object next(){
index++;
return this.iterator.next();
}

public CopyableIterator copy(){
return new CopyableIterator(this.collection,this.index)

}

}

Disclaimer: This is roughly the class. It has not been tested.


What exactly would it mean to copy an Iterator? Do you mean than it should be able to create a new Iterator just like itself except starting at the beginning? That's the responsibility of an Iterable... it doesn't make sense to duplicate that functionality, especially given the stateful nature of iterators... it would just confuse things.

What would you expect to happen if you wrote:

Iterator<Foo> iter = someIterable.iterator();
iter.next();
iter.next();
for (Foo foo : iter) {
  ...
}

Would you expect the for loop to iterate over every item the iterator would return, or every one except the first two? Would you expect the iterator to be empty after the for loop completes?


Is there any reason? Just to make it less complicated to implement it?

It would be simple to design and implement an Iterator wrapper class that supported a copy operation. I'm not sure it would be generally useful though, not least because in the general case it would be an expensive operation. This alone would be sufficient reason for the Java designers to not want to add copy() to the Iterator interface.

FOLLOWUP

This is the kind of thing I'm thinking of:

public class CopyableIterator<T> implements Iterator<T> {
    private Iterator<T> it;
    private List<T> copy = new ArrayList<T>();
    private int pos;
    public CopyableIterator(Iterator<T> it) {
        while (it.hasNext()) {
            copy.append(it.next());
        }
        this.it = copy.iterator();
    }
    public T next() {
        T res = next();
        pos++;
        return res;
    }
    public boolean hasNext() {
        return it.hasNext();
    }
    public Iterator<T> copy() {
        return copy.sublist(pos, copy.size()).iterator();
    }
    public void remove() {
        throw new UnsupportedOperationException();
    }
}

The reasoning is this:

  • If I'm wrapping an opaque Iterator, then the only way I can copy it is by reading it using next() and hasNext() and constructing the copy Iterator from that.

  • But I have to do that before I start using the original iterator.

  • The simple way to do that is to make that copy of the iterator's content before I start using it. (It could possibly be done with a lazy incremental copying, but the implementation could get very complicated ... especially when you consider copying copied iterators.)

The method proposed in the other answer is limited to plain collection iterators. If you have a wrapped iterator, or an iterator from some other source that (for instance) doesn't implement Iterable, then you are toasted.

And even with that precondition, the method above doesn't return a true copy of the iterator. Rather, it returns a new iterator for the underlying collection. That is an important distinction. Unless you actually copy the iterated element references, there's no guarantee that the iterators will return the same sequence. Look at the documented behaviour of the iterators for the Concurrent... collection types.


ILMTitan's and Christoffer Hammarström's imply but don't state concretely that copying the stream may not be possible because it requires that the stream elements have an implementation of a copy function, in order to save the state that a copyable iterator would require. Realize that elements could be mutable (or reference dynamic values) and they may reference other structures and semantics which require a customized copy function.

Thus copyable iterators is not orthogonal to copyable stream elements, so this is why copyable iterators are not possible in general.

Another more obscure reason is that the act of copying has side-effects on memory allocation and deallocation. Even the stream elements' copy function(s) might have other side effects.

Another reason is that some low-level optimizations might not be possible when compiling to assembly language.


Iterators were created to step through all the objects in a collection one at a time, using data backed by said collection.

Iterator<T> is almost always implemented using a private inner class that may use state that is part of the outer class. Thus, you can't really modify an Iterator's behavior without writing your own Collection (or whatever) as well.

Copying the Iterator could cause a host of new problems, such as falling out of sync with the backing collection.


You can't possibly copy an iterator - it's basically meaningless. For some, this is obvious from the Iterator interface, but let's demonstrate it with a concrete example. In fact, let's demonstrate with an example about concrete.

Java: why are iterators not copyable

This is a picture of a concrete iterator over a bar of concrete. Iteration in our case means applying the crow bar to break a piece off of the bar. Now, note that:

  • The bar is not a collection of pieces (though some of it has faults): We're creating pieces as we iterate.
  • The result of an iteration via the iterator (of next()) can never be the result of another iteration of the bar. The result has been removed from it.
  • The iteration may produce a different piece depending on the weather, on the amount of force you apply, or maybe on some kind of thermal noise (think: randomness).
  • The result of an iteration via the iterator (of next()) can never be the result of another iteration of the bar - as the probability space of exact iteration result is continuous, and no specific resulting piece has a non-zero probability measure.

Any of the above should convince you not to try to "copy the iterator", that's silly...

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号