I've got multiple access logs in the logs directory, following the naming convention below:
access.log.1284642120
access.log.1284687600
access.log.1284843260
Basically, the logs are rotated by Apache once per day, so they can be sorted into chronological order.
I am trying to read them one after another, so that they can be treated as a single log file.
my @logs = glob('logs/access.log.*');
The above code will glob all the logs, but I am not sure:
- In which order will glob return the logs? Alphabetically?
- If I want to find the latest access time for each unique IP, how can I do that across all of the logs?
I have a Perl script that can do this easily for a single access log (my approach is to keep a big hash that uses the IP address as the key and the access time as the value, updating the entry each time the IP appears). But I don't want to merge all the access logs into one temporary file just for this.
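For reference, a minimal sketch of the single-file version described above. The filename and the two regexes are assumptions: they expect Apache's common log format, with the client IP as the first field and the timestamp in square brackets.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Keep the most recent access time seen for each IP.
my %latest;   # IP address => last-seen timestamp string

open my $log, '<', 'logs/access.log.1284642120' or die "open: $!";
while ( my $line = <$log> ) {
    # Common Log Format: IP is the first field, timestamp is in [...].
    my ($ip)   = $line =~ /^(\S+)/       or next;
    my ($time) = $line =~ /\[([^\]]+)\]/ or next;
    $latest{$ip} = $time;   # later lines overwrite earlier ones
}
close $log;

printf "%-15s %s\n", $_, $latest{$_} for sort keys %latest;
```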
Any suggestions? Many thanks in advance.
If you want to ensure a particular order, sort it yourself, even if just to assure yourself that it will come out right:
my @files = sort { ... } glob( ... );
In this case, where the filenames are identical apart from the epoch timestamp, and the timestamps all have the same number of digits, the default string sort already comes out in chronological order:
my @files = sort glob( ... );
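If you'd rather not rely on the string sort, you can compare the numeric suffix explicitly. This is a sketch; the regex assumes the access.log.EPOCH naming shown in the question:

```perl
my @files = sort {
    my ($ta) = $a =~ /\.(\d+)\z/;   # epoch suffix of $a
    my ($tb) = $b =~ /\.(\d+)\z/;   # epoch suffix of $b
    $ta <=> $tb;                    # numeric compare, oldest first
} glob('logs/access.log.*');
```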
To read them as one über-file, I like to use a local @ARGV so I can use the diamond operator, which is really just the magic ARGV filehandle. When it reaches the end of one file in @ARGV, it moves on to the next. This fakes specifying all the files on the command line by assigning to @ARGV inside the program:

{
    local @ARGV = sort { ... } glob( ... );
    while( <> ) {
        ...;
    }
}

If you need to know which file you are currently processing, look in $ARGV.
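Putting this together with the per-IP hash from the question, a sketch (again, the regexes assume Apache's usual format, with the IP first and the timestamp in square brackets):

```perl
my %latest;   # IP => timestamp of its most recent request

{
    local @ARGV = sort glob('logs/access.log.*');
    while ( <> ) {
        my ($ip)   = /^(\S+)/       or next;
        my ($time) = /\[([^\]]+)\]/ or next;
        $latest{$ip} = $time;   # files are read oldest-first,
                                # so the last assignment wins
    }
}
```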
If you need something fancier, you might have to resort to brute force.
In a Unix-y environment, you can leverage the shell to group your files together:
my @files = glob("$dir/access.log.*");
open my $one_big_logfile, "-|", "cat @files" or die ...;
while (<$one_big_logfile>) {
...
}
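One caveat: interpolating @files into a single string hands the whole command to the shell, so filenames containing spaces or shell metacharacters would break it. The list form of pipe-open (available on Unix-like systems since Perl 5.8) bypasses the shell entirely:

```perl
my @files = sort glob("$dir/access.log.*");

# List-form pipe-open: "cat" receives @files directly as arguments,
# with no shell in between to reinterpret them.
open my $one_big_logfile, "-|", "cat", @files
    or die "cannot fork cat: $!";
while (<$one_big_logfile>) {
    ...
}
```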