I'm a Java programmer. I use bash scripts a lot for text processing.
Utilities like grep, sed, awk, tr, wc and find, along with piping between commands, give such a powerful combination.
However, bash programming lacks portability, testability and the more elegant programming constructs that exist in Java. It also makes it harder to integrate into our other Java products.
I was wondering if anyone knows of any Java text processing libraries out there which might offer what I'm looking for.
It would be so cool to be able to write:
Text.createFromFile("blah.txt").grep("-v","ERROR.*").sed("s/ERROR/blah/g").awk("print $1").writeTo("output.txt")
This might be pie-in-the-sky stuff, but I thought I'd put the question out there anyway.
Unix4j implements some basic unix commands, mainly focussing on text-processing (with support for piping between commands): http://www.unix4j.org
Example (Ben's example, but without awk as this is not currently supported):
Unix4j.fromStrings("1:here is no error", "2:ERRORS everywhere", "3:another ERROR", "4:nothing").toFile("blah.txt");
Unix4j.fromFile("blah.txt").grep(Grep.Options.v, "ERROR.*").sed("s/ERROR/blah/g").toFile("output.txt");
Unix4j.fromFile("output.txt").toStdOut();
>>>
1:here is no error
4:nothing
Note:
- the author of the question is involved in the unix4j project
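Since awk is not supported, the missing field-extraction step can be approximated in plain Java. The following is only a rough sketch using JDK streams, not Unix4j; the class name TextPipeline and the method process are made up for illustration, and it assumes fields are separated by whitespace, as in awk's default mode.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Unix4j or any other library.
public class TextPipeline {
    private static final Pattern ERROR = Pattern.compile("ERROR.*");

    public static List<String> process(List<String> lines) {
        return lines.stream()
                // grep -v "ERROR.*": keep only lines with no match
                .filter(line -> !ERROR.matcher(line).find())
                // sed "s/ERROR/blah/g": replace every occurrence
                .map(line -> line.replaceAll("ERROR", "blah"))
                // awk '{print $1}': first whitespace-separated field
                .map(line -> line.trim().split("\\s+")[0])
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "1:here is no error",
                "2:ERRORS everywhere",
                "3:another ERROR",
                "4:nothing");
        // Prints "1:here" and "4:nothing", one per line.
        process(input).forEach(System.out::println);
    }
}
```

To work on files instead of an in-memory list, the input could come from Files.readAllLines and the result could be written with Files.write. Note that the sed step is a no-op here, because the grep step has already dropped every line containing ERROR.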
Believe it or not, I have used embedded Ant for many of these tasks.
Update
Ant has Java APIs that allow it to be called from Java projects. This is embedded mode. This is a reference to the Ant API 1.6.1. The distribution should include docs as well.
To use it, you would create a new task object, set the appropriate parameters and execute it, just as you would in build.xml but via the Java API. Then you can run your task.
Something like
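import java.io.File;
import org.apache.tools.ant.taskdefs.optional.ReplaceRegExp;

ReplaceRegExp regexp = new ReplaceRegExp();
regexp.setMatch("bla");                  // regular expression to look for
regexp.setReplace("blah");               // replacement text (example value, an assumption)
regexp.setFile(new File("inputFile"));   // file that is rewritten in place
regexp.execute();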
You may need to set up some other stuff as well.
I'm not sure whether it solves your problem, but Ant ships with many tasks for manipulating files and text. Just search through the docs.