I want to extract some specific words from the following string:
Exported Layer : missing_hello
Comment :
Total Polygons : 20000 (reported 100).
I want to extract "missing_hello" and "20000" from the above string and display them as
missing_hello : 20000
How do I do that in Unix?
Assuming that missing_hello is always a single word, you can use:
perl -lane '$el=$F[3] if(/Exported Layer/); print "$el : $F[3]" if(/Total Polygons/);'
Take a look at this guide: http://www.grymoire.com/Unix/Sed.html
Sed is certainly a tool worth learning. I would look specifically at the sections titled "Using \1 to keep part of the pattern", and "Working with Multiple Lines".
If you have perl, you could use this:
use strict;
use warnings;

my $layer;
my $polys;

while (<>) {
    if ($_ =~ m{^Exported \s Layer \s : \s (\S+)}xms) {
        $layer = $1;
        next;
    }
    if ($_ =~ m{^Total \s Polygons \s : \s (\d+)}xms) {
        $polys = $1;
    }
    if (defined $layer && defined $polys) {
        print "$layer : $polys\n";
        $layer = $polys = undef;
    }
}
In awk:
awk -F: '/Exported Layer/ { export_layer = $2 }
/Total Polygons/ { printf("%s : %s\n", export_layer, $2); }' "$@"
If the input is garbage, the output will be too (GIGO). If the fields can contain colons, life gets messier.
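If colons inside fields are a concern, one possible workaround in awk is to strip the label with sub() instead of splitting on every colon. This is only a sketch: the sample input (with a colon deliberately placed in the layer name) and the variable names `input` and `result` are made up for illustration, and it assumes each label starts at the beginning of its line.

```shell
# Hypothetical sample input; the layer name contains a colon on purpose.
input='Exported Layer : missing:hello
Comment :
Total Polygons : 20000 (reported 100).'

# Remove the leading label and its colon, then keep whatever is left of the
# line, so later colons in the value are left alone.
result=$(printf '%s\n' "$input" | awk '
/^Exported Layer *:/ { sub(/^Exported Layer *: */, ""); layer = $0 }
/^Total Polygons *:/ { sub(/^Total Polygons *: */, ""); printf("%s : %s\n", layer, $0) }')

printf '%s\n' "$result"
```

Like the -F: version, this still prints the commentary that follows the number.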
In sed:
sed -n -e '/Exported Layer : *\(.*\)/{s//\1 : /;h;}' \
-e '/Total Polygons : *\(.*\)/{s//\1/;x;G;s/\n//;p;}' "$@"
Colons in fields are not a problem with this sed version.
Now tested on MacOS X 10.6.7. Both scripts include the commentary after the number in the 'Total Polygons' line. Both scripts can fairly easily be revised to only print the number and ignore the commentary. It would help to have a precise definition of all the format possibilities.
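As a rough sketch of that revision, here are variants that print only the number. They assume the exact layout shown in the question (a space on each side of the colon, a single-word layer name, digits straight after the label); the `input`, `awk_out` and `sed_out` variables are just scaffolding for the example.

```shell
# Sample input copied from the question.
input='Exported Layer : missing_hello
Comment :
Total Polygons : 20000 (reported 100).'

# awk: split on whitespace instead of colons; the value is field 4 on both lines.
awk_out=$(printf '%s\n' "$input" | awk '
/^Exported Layer/ { layer = $4 }
/^Total Polygons/ { printf("%s : %s\n", layer, $4) }')

# sed: capture only the run of digits after the label and drop the rest of the line.
sed_out=$(printf '%s\n' "$input" | sed -n \
  -e '/^Exported Layer : *\(.*\)/{s//\1 : /;h;}' \
  -e '/^Total Polygons : *\([0-9][0-9]*\).*/{s//\1/;x;G;s/\n//;p;}')

printf '%s\n' "$awk_out" "$sed_out"
```

The awk variant gives up on multi-word layer names in exchange for simpler field handling; the sed variant keeps them, since it still captures everything after the label.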
I would probably actually use Perl (or Python) to do this job; the field splitting is just messy enough to benefit from the better facilities in those languages.