In Hadoop 'grep' example (tha开发者_开发问答t comes with the Hadoop package) what is the group parameter.Can you give me an example for that.
Disclaimer: I haven't run this example and am pulling answering after just looking at http://wiki.apache.org/hadoop/Grep
The CLI call is: bin/hadoop org.apache.hadoop.examples.Grep <indir> <outdir> <regex> [<group>]
and you want to know about <group>
.
I suspect this is grouping in regex. (random link - http://www.exampledepot.com/egs/java.util.regex/Group.html)
As noted on the Hadoop Grep link
The command works different than the Unix grep call: it doesn't display the complete matching line, but only the matching string
What I take from this is that if you specify the <group>
value (a number) it will output only the value for that group.
For an example (pulling from the Group link)
input: aba
regex: (a(b)*)+
group 0: aba
group 1: a
group 2: b
If the value for <group>
is 1
then the result would be a
. Group 0 is the full match, not the original string, In this case it just happens to be the same.
hth
精彩评论