
How to detect EOF in awk?

开发者 https://www.devze.com 2022-12-10 06:57 Source: the web

Is there a way to determine whether the current line is the last line of the input stream?


The special END pattern will match only after the end of all input. Note that this pattern can't be combined with any other pattern.

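More useful is probably the getline pseudo-function, which replaces $0 with the next line and returns 1, or returns 0 at EOF (and -1 on error), which I think is what you want. Keep in mind that getline consumes the line it reads, so the main rule only sees every other line. If the input has an even number of lines, EOF is reached by the main loop rather than by getline, and the test never fires.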

For example:

awk '{ if(getline == 0) { print "Found EOF"} }'

If you are only processing one file, an END rule does the same job more reliably, because it runs exactly once after the last record:

awk 'END { print "Found EOF" }'
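
To make the difference between the two concrete, here is a minimal sketch, assuming a three-line sample input that I made up. The first command relies on getline returning 0 at EOF; the second just remembers the most recent record and reports it from END. The variable names (current, last) are illustrative only.

```shell
# Hypothetical sample input with three lines.
input=$(printf 'a\nb\nc\n')

# getline returns 1 when it reads a record, 0 at EOF and -1 on error.
# It also consumes the record it reads, so the main rule only sees
# lines 1 and 3 here; with an even line count nothing would be printed.
getline_out=$(printf '%s\n' "$input" | awk '{
    current = $0
    if ((getline) <= 0) print "last line: " current
}')

# Remembering the latest record and reporting it in END sees every line.
end_out=$(printf '%s\n' "$input" | awk '
{ last = $0 }
END { if (NR > 0) print "last line: " last }')

echo "$getline_out"
echo "$end_out"
```

Both commands report line c for this input, but only the END version keeps working when the number of lines is even.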


You've got two options, both kind of messy.

  1. Store a copy of every current line in a temp variable, and then use the END block to process it.
  2. Use a command pipe into getline, such as "wc -l < file" | getline total, in the BEGIN block to get the number of lines in the file, and then count up to that value. (The system() function runs a command but does not capture its output.)

You might have to play with #2 a little to get it to run, but it should work. It's been a while since I've done any awk.
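
A rough sketch of both options, assuming a temporary sample file created just for the demonstration. The names tmp, file, cmd, total and last are my own, and the quoting in option 2 assumes the path contains no double quotes.

```shell
# Hypothetical sample file for the demonstration.
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

# Option 1: keep a copy of the current line; in END it holds the last line.
opt1=$(awk '{ last = $0 } END { if (NR > 0) print "last: " last }' "$tmp")

# Option 2: read the line count from wc -l through a command pipe in BEGIN,
# then compare NR against it while reading the file.
opt2=$(awk -v file="$tmp" '
BEGIN {
    cmd = "wc -l < \"" file "\""
    cmd | getline total
    close(cmd)
    total += 0          # force a number; some wc versions pad with spaces
}
NR == total { print "last: " $0 }' "$tmp")

rm -f "$tmp"
echo "$opt1"
echo "$opt2"
```

Option 2 depends on the file ending with a newline, since wc -l counts newline characters rather than records.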


These are the only sensible ways to do what you want, in order of best to worst:

awk 'NR==FNR{max++; next} FNR == max { print "Final line:",$0 }' file file

awk -v max="$(wc -l < file)" 'FNR == max { print "Final line:",$0 }' file

awk 'BEGIN{ while ( (getline dummy < ARGV[1]) > 0) max++; close(ARGV[1])} FNR == max { print "Final line:",$0 }' file


The gawk implementation has a special rule called ENDFILE, which is triggered after each file in the argument list has been processed. This works (gawk only):

gawk '{line=$0} ENDFILE {print line}' files...

More details can be found here.


Detecting EOF is not very reliable when multiple files are given on the command line. Detecting the start of a file is more reliable.

To do this, treat the first file specially: its FNR==1 does not mark the end of anything, so it only records the filename.

For every later file, FNR==1 marks the end of the previous file. At that moment last_filename still holds the name of the file that just ended; it is then updated to the file now being processed.

Do your per-file EOF processing inside the else block, AND in the END block for the final file.

   gawk 'BEGIN { last_filename = "" }
      FNR == 1 {
         if (last_filename == "") { last_filename = FILENAME }
         else { print "EOF: " last_filename; last_filename = FILENAME }
      }
      END { print "END: " last_filename }' "$@"

With multiple files, the else block runs at the end of every file except the last one, whose EOF is handled in the END block.

With a single file, the else block never runs and only the END block executes.


I'm not even sure how to categorize this "solution"

{
    t = lastline
    lastline = $0
    $0 = t
}

/test/ {
    print "line <" $0 "> had a _test_"
}

END {
    # now you have "lastline", it can't be processed with the above statements
    # ...but you can work with it here
}

The cool thing about this hack is that by assigning to $0, all the remaining pattern-action rules still work, just one line delayed (on the first record, $0 is empty). You can't get those rules to run for the last line from END, even if you put END on top, but you do have the last line in hand and you haven't done anything else to it.
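
Here is a small runnable sketch of the same hack, assuming a made-up three-line input. The NR > 1 guard is my addition, to skip the empty $0 that the first record produces; the rest follows the script above.

```shell
# Hypothetical input: the first and last lines contain "test".
delayed_out=$(printf 'a test\nplain\nlast test\n' | awk '
# Swap in the previous line so every later rule runs one record late.
{ t = lastline; lastline = $0; $0 = t }

# Ordinary rules see the delayed line; record 1 has an empty $0.
NR > 1 && /test/ { print "line <" $0 "> had a _test_" }

# The real last line never reaches the rules above; handle it here.
END { print "final line: " lastline }')

echo "$delayed_out"
```

The final line matches /test/ too, but it is only reported by END, which is exactly the point where you can treat it differently.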


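To detect the last line of each file in the argument list, note that awk has no built-in EOF variable. Remember the previous record and report it when the next file starts, and once more in END:

FNR == 1 && NR > 1 { print "last line (" file "): " last }
{ file = FILENAME; last = $0 }
END { if (NR) print "last line (" file "): " last }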


One easy way is to run the file through an intermediate sed script that prefixes every line except the last with a 0, and the last line with a 1.

cat input_file | sed 's/^/0/;$s/0/1/' | awk '{LST=/^1/;$0=substr($0,2)}
... your awk script in which you can use LST to check for the
... last line.'
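
A filled-in version of that pipeline, as a sketch with a made-up three-line input. LST comes from the command above; the label variable and the output format are only for illustration.

```shell
# sed prefixes every line with 0, then turns that 0 into 1 on the last line.
sed_out=$(printf 'x\ny\nz\n' | sed 's/^/0/;$s/0/1/' | awk '{
    LST = /^1/               # 1 only on the last line
    $0 = substr($0, 2)       # strip the marker again
    label = LST ? "last: " : "line: "
    print label $0
}')

echo "$sed_out"
```

Because every line gets a marker, lines that already start with 0 or 1 are handled correctly.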


Hmm, the awk END pattern only fires once you have already reached EOF, so I guess it isn't really much help to you.


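You can try this. It tracks the expected FNR in PFNR, so the second rule runs only when FNR resets at the start of a new file: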

awk 'BEGIN{PFNR=1} FNR==PFNR{PFNR++;next} {print FILENAME,PFNR=2} END{print FILENAME}' file1 file2


A portable solution is provided in the gawk user manual, although as mentioned in another answer, gawk itself has BEGINFILE and ENDFILE.
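
As a rough sketch of what such a portable approach can look like (loosely modeled on the file-transition idea as I remember it, not copied from the manual): watch FILENAME for changes and call user-defined hooks. The function names beginfile and endfile, the oldname variable and the sample files are all assumptions of mine.

```shell
# Hypothetical sample files in a scratch directory.
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/a.txt"
printf 'b1\n' > "$dir/b.txt"

boundary_out=$(cd "$dir" && awk '
function beginfile(name) { print "begin " name }
function endfile(name)   { print "end " name }

# A change of FILENAME means the previous file has ended.
FILENAME != oldname {
    if (oldname != "") endfile(oldname)
    oldname = FILENAME
    beginfile(FILENAME)
}

# The last file has no successor, so close it out here.
END { if (oldname != "") endfile(oldname) }' a.txt b.txt)

rm -rf "$dir"
echo "$boundary_out"
```

One limitation of comparing names: the same file listed twice in a row is not detected as a boundary, and empty files are skipped entirely.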
