开发者

Iterating over the content of a text file line by line - is there a best practice? (vs. PMD's AssignmentInOperand)

开发者 https://www.devze.com 2023-02-04 23:12 出处:网络
We have a Java Application that has a few modules that know to read text files. They do it q开发者_开发百科uite simply with a code like this:

We have a Java Application that has a few modules that know to read text files. They do it q开发者_开发百科uite simply with a code like this:

BufferedReader br = new BufferedReader(new FileReader(file));  
String line = null;  
while ((line = br.readLine()) != null)  
{  
   ... // do stuff to file here  
} 

I ran PMD on my project and got the 'AssignmentInOperand' violation on the while (...) line.

Is there a simpler way of doing this loop other than the obvious:

String line = br.readLine();  
while (line != null)  
{  
   ... // do stuff to file here  
   line = br.readLine();  
} 

Is this considered a better practice? (although we "duplicate" the line = br.readLine() code?)


I know is an old post but I just had the same need (almost) and I solve it using a LineIterator from FileUtils in Apache Commons. From their javadoc:

LineIterator it = FileUtils.lineIterator(file, "UTF-8");
try {
    while (it.hasNext()) {
    String line = it.nextLine();
    // do something with line
    }
} finally {
    it.close();
}

Check the documentation: http://commons.apache.org/proper/commons-io/javadocs/api-release/org/apache/commons/io/LineIterator.html


The support for streams and Lambdas in java-8 and Try-With-Resources of java-7 allows you to achive what you want in more compact syntax.

Path path = Paths.get("c:/users/aksel/aksel.txt");

try (Stream<String>  lines = Files.lines(path)) {
    lines.forEachOrdered(line->System.out.println(line));
} catch (IOException e) {
    //error happened
}


I generally prefer the former. I don't generally like side-effects within a comparison, but this particular example is an idiom which is so common and so handy that I don't object to it.

(In C# there's a nicer option: a method to return an IEnumerable<string> which you can iterate over with foreach; that isn't as nice in Java because there's no auto-dispose at the end of an enhanced for loop... and also because you can't throw IOException from the iterator, which means you can't just make one a drop-in replacement for the other.)

To put it another way: the duplicate line issue bothers me more than the assignment-within-operand issue. I'm used to taking in this pattern at a glance - with the duplicate line version I need to stop and check that everything's in the right place. That's probably habit as much as anything else, but I don't think it's a problem.


I routinely use the while((line = br.readLine()) != null) construct... but, recently I came accross this nice alternative:

BufferedReader br = new BufferedReader(new FileReader(file));

for (String line = br.readLine(); line != null; line = br.readLine()) {
   ... // do stuff to file here  
}

This is still duplicating the readLine() call code, but the logic is clear, etc.

The other time I use the while(( ... ) ...) construct is when reading from a stream in to a byte[] array...

byte[] buffer = new byte[size];
InputStream is = .....;
int len = 0;
while ((len = is.read(buffer)) >= 0) {
    ....
}

This can also be transformed in to a for loop with:

byte[] buffer = new byte[size];
InputStream is = .....;
for (int len = is.read(buffer); len >= 0; len = is.read(buffer)) {
    ....
}

I am not sure I really prefer the for-loop alternatives.... but, it will satisfy any PMD tool, and the logic is still clear, etc.


Based on Jon's answer I got to thinking it should be easy enough to create a decorator to act as a file iterator so you can use a foreach loop:

public class BufferedReaderIterator implements Iterable<String> {

    private BufferedReader r;

    public BufferedReaderIterator(BufferedReader r) {
        this.r = r;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {

            @Override
            public boolean hasNext() {
                try {
                    r.mark(1);
                    if (r.read() < 0) {
                        return false;
                    }
                    r.reset();
                    return true;
                } catch (IOException e) {
                    return false;
                }
            }

            @Override
            public String next() {
                try {
                    return r.readLine();
                } catch (IOException e) {
                    return null;
                }
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }

        };
    }

}

Fair warning: this suppresses IOExceptions that might occur during reads and simply stops the reading process. It's unclear that there's a way around this in Java without throwing runtime exceptions as the semantics of the iterator methods are well defined and must be conformed to in order to use the for-each syntax. Also, running multiple iterators here would have some strange behavior; so I'm not sure this is recommended.

I did test this, though, and it does work.

Anyway, you get the benefit of for-each syntax using this as a kind of decorator:

for(String line : new BufferedReaderIterator(br)){
    // do some work
}


Google's Guava Libraries provide an alternative solution using the static method CharStreams.readLines(Readable, LineProcessor<T>) with an implementation of LineProcessor<T> for processing each line.

try (BufferedReader br = new BufferedReader(new FileReader(file))) {
    CharStreams.readLines(br, new MyLineProcessorImpl());
} catch (IOException e) {
    // handling io error ...
}

The body of the while loop is now placed in the LineProcessor<T> implementation.

class MyLineProcessorImpl implements LineProcessor<Object> {

    @Override
    public boolean processLine(String line) throws IOException {
        if (// check if processing should continue) {
            // do sth. with line
            return true;
        } else {
            // stop processing
            return false;
        }
    }

    @Override
    public Object getResult() {
        // return a result based on processed lines if needed
        return new Object();
    }
}


I'm a bit surprised the following alternative was not mentioned:

while( true ) {
    String line = br.readLine();
    if ( line == null ) break;
    ... // do stuff to file here
}

Before Java 8 it was my favorite because of its clarity and not requiring repetition. IMO, break is a better option to expressions with side-effects. It's still a matter of idioms, though.


AssignmentInOperand is a controversial rule in PMD, the reason of this rule is: "this can make code more complicated and harder to read" (please refer http://pmd.sourceforge.net/rules/controversial.html)

You could disable that rule if you really want to do it that way. In my side I prefer the former.

0

精彩评论

暂无评论...
验证码 换一张
取 消