Passing variable to awk and using that in a regular expression

Developer https://www.devze.com 2022-12-19 09:39 Source: Internet
I'm learning awk and I have trouble passing a variable to the script AND using it as part of a regex search pattern.

The example is contrived but shows my problem.

My data is the following:

Eddy        Smith       0600000000  1981-07-16    Los Angeles
Frank       Smith       0611111111  1947-04-29    Chicago           
Victoria    McSmith     0687654321  1982-12-16    Los Angeles
Barbara     Smithy      0633244321  1984-06-24    Boston            
Jane        McSmithy    0612345678  1947-01-15    Chicago               
Grace       Jones       0622222222  1985-10-07    Los Angeles
Bernard     Jones       0647658763  1988-01-01    New York          
George      Jonesy      0623428948  1983-01-01    New York          
Indiana     McJones     0698732298  1952-01-01    Miami             
Philip      McJonesy    0644238523  1954-01-01    Miami

I want an awk script that takes a variable and uses its value as a regex. This is the script I have now, called "003_search_persons.awk".

#this awk script looks for a certain name, returns firstName, lastName and City

#print column headers
BEGIN {
    printf "firstName lastName City\n";
}

#look for the name, print firstName, lastName and City
$2 ~ name {
    printf $1 " " $2 " " $5 " " $6;
    printf "\n";
}

I call the script like this:

awk -f 003_search_persons.awk name=Smith 003_persons.txt

It returns the following, which is good.

firstName lastName City
Eddy Smith Los Angeles
Frank Smith Chicago
Victoria McSmith Los Angeles
Barbara Smithy Boston
Jane McSmithy Chicago

But now I want to look for a certain prefix, "Mc". I could of course hardcode this, but I want an awk script that is flexible. I wrote the following in 003_search_persons_prefix.awk.

#this awk script looks for a certain prefix to a name, returns firstName, lastName and City

#print column headers
BEGIN {
    printf "firstName lastName City\n";
}

#look for the prefix, print firstName, lastName and City
/^prefix/{
    printf $1 " " $2 " " $5 " " $6;
    printf "\n";
}

I call the script like this:

awk -f 003_search_persons_prefix.awk prefix=Mc 003_persons.txt

But now it finds no records.

The problem is the search pattern "/^prefix/". I know I can replace that search pattern by a non-regex one, as in the first script, but suppose I want to do it with a regex, because I need the prefix to really be at the start of the lastName field, as it should be, being a prefix and all ;-)

How do I do this?


You can try this:

BEGIN{
 printf "firstName lastName City\n";
 split(ARGV[1], n,"=")
 prefix=n[2]
 pat="^"prefix
}
$0 ~ pat{
    print "found: "$0
}

Output:

$ awk -f  test.awk name=Jane file
firstName lastName City
found: Jane        McSmithy    0612345678  1947-01-15    Chicago

Look at the awk documentation for more (and read it from start to finish!).
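
A likely reason this answer parses ARGV by hand: an assignment operand such as name=Jane is only applied when awk reaches it in the argument list, which happens after the BEGIN block has run, whereas -v assigns the variable before BEGIN. A minimal sketch of the difference, assuming a POSIX shell and awk; the one-line temporary input file and the variable names `operand_out` and `dashv_out` are purely illustrative:

```shell
# Illustrative one-line input so the main rule fires exactly once.
input=$(mktemp)
printf 'x\n' > "$input"

# Print the value of "name" as seen in BEGIN and in the main rule.
prog='BEGIN { print "BEGIN: [" name "]" } { print "main: [" name "]" }'

# Operand assignment: applied after BEGIN, just before the file is read.
operand_out=$(awk "$prog" name=Jane "$input")

# -v assignment: applied before BEGIN runs.
dashv_out=$(awk -v name=Jane "$prog" "$input")

printf '%s\n---\n%s\n' "$operand_out" "$dashv_out"
rm -f "$input"
```

With the operand form, `name` is still empty inside BEGIN and only set in the main rule; with -v it is set in both places.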


Change your script to:

BEGIN {
    print "firstName", "lastName", "City"
    ORS = "\n\n"
}

# match the prefix at the start of the lastName field ($2), not the whole record
$2 ~ "^" prefix {
    print $1, $2, $5, $6
}

and call it as

awk -v prefix="Mc" -f 003_search_persons.awk 003_persons.txt
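
The underlying point is that /^prefix/ is a regex constant that matches the literal text "prefix", while a string built from the variable and applied with ~ is treated as a dynamic regex. A small self-contained sketch of that idea, assuming a POSIX shell; the rows are copied from the question's data, and the temporary file and the `result` variable are only for illustration:

```shell
# Sample rows copied from the question's data file.
data=$(mktemp)
cat > "$data" <<'EOF'
Eddy        Smith       0600000000  1981-07-16    Los Angeles
Victoria    McSmith     0687654321  1982-12-16    Los Angeles
Grace       Jones       0622222222  1985-10-07    Los Angeles
Indiana     McJones     0698732298  1952-01-01    Miami
EOF

# Build the pattern from the variable: "^" concatenated with its value,
# matched against the lastName field only.
result=$(awk -v prefix="Mc" '$2 ~ ("^" prefix) { print $1, $2 }' "$data")

printf '%s\n' "$result"
rm -f "$data"
```

The parentheses around the concatenation are optional, since concatenation binds more tightly than ~, but they make the intent easier to read.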


You should be able to use your original script unchanged - $2 ~ name is already doing a regex search so if you call your script with name=^Mc then it will return names starting with "Mc". Actually this is not a good example, since Mc only appears at the start of the name - if you use name=^Smith then it will find the Smiths but not the McSmiths.
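
To make the anchoring effect concrete, here is a hedged sketch that runs the same `$2 ~ name` rule with and without the leading ^. It assumes a POSIX shell; the rows come from the question's data, the temporary file and the variable names `unanchored` and `anchored` are illustrative, and the value is quoted only as a precaution because some shells treat ^ specially:

```shell
# Sample rows copied from the question's data file.
data=$(mktemp)
cat > "$data" <<'EOF'
Eddy        Smith       0600000000  1981-07-16    Los Angeles
Victoria    McSmith     0687654321  1982-12-16    Los Angeles
Barbara     Smithy      0633244321  1984-06-24    Boston
Jane        McSmithy    0612345678  1947-01-15    Chicago
EOF

# Unanchored: "Smith" may appear anywhere in the lastName field.
unanchored=$(awk '$2 ~ name { print $2 }' name='Smith' "$data")

# Anchored: the lastName field must start with "Smith".
anchored=$(awk '$2 ~ name { print $2 }' name='^Smith' "$data")

printf '%s\n---\n%s\n' "$unanchored" "$anchored"
rm -f "$data"
```

The unanchored run lists all four last names, while the anchored run keeps only Smith and Smithy.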


Is awk specifically required? I'm sure it's quite possible in awk, but I don't know it. If you just need to get the job done, you can try the following, though I'm not sure exactly what the delimiter is.

cut -d " " -f1-2,5 file | grep -E '^regex'