Linux uses copy-on-write (COW) to keep memory usage low after a fork, but the way perl implements Perl 5 variables seems to defeat this optimization. For instance, for the variable:
my $s = "1";
perl
is really storing:
SV = PV(0x100801068) at 0x1008272e8
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x100201d50 "1"\0
CUR = 1
LEN = 16
When you use that string in a numeric context, it modifies the C struct
representing the data:
SV = PVIV(0x100821610) at 0x1008272e8
REFCNT = 1
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 1
PV = 0x100201d50 "1"\0
CUR = 1
LEN = 16
The string pointer itself did not change (it is still 0x100201d50), but the value now lives in a different C struct (a PVIV instead of a PV). I did not modify the value at all, yet suddenly I am paying a COW cost. Is there any way to lock the perl representation of a Perl 5 variable, so that this time-saving hack (perl not having to convert "0" to 0 a second time) doesn't hurt my memory usage?
Note, the representations above were generated from this code:
perl -MDevel::Peek -e '$s = "1"; Dump $s; $s + 0; Dump $s'
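The upgrade is easy to watch, and the same observation suggests pre-warming scalars in the parent before forking. A minimal sketch (Devel::Peek's Dump writes to STDERR):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Devel::Peek;

my $s = "1";
Dump $s;           # plain PV: FLAGS = (POK,pPOK)
my $n = $s + 0;    # first numeric use upgrades the SV to a PVIV
Dump $s;           # now IOK is set; later numeric uses won't modify the struct
print "sum: ", $s + $n, "\n";
```

If the `$s + 0` happens in the parent, forked children read a struct that is already upgraded and never need to write to its page.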
The only solution I have found so far is to force perl to do all of the conversions I expect in the parent process before forking. As the results below show, even that only helps a little.
Results:
Useless use of addition (+) in void context at z.pl line 34.
Useless use of addition (+) in void context at z.pl line 45.
Useless use of addition (+) in void context at z.pl line 51.
before eating memory
used memory: 71
after eating memory
used memory: 119
after 100 forks that don't reference variable
used memory: 144
after children are reaped
used memory: 93
after 100 forks that touch the variables metadata
used memory: 707
after children are reaped
used memory: 93
after parent has updated the metadata
used memory: 109
after 100 forks that touch the variables metadata
used memory: 443
after children are reaped
used memory: 109
Code:
#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;

sub print_mem {
    print @_, "used memory: ", `free -m` =~ m{cache:\s+([0-9]+)}s, "\n";
}

print_mem("before eating memory\n");
my @big = ("1") x (1_024 * 1_024);
my $pm  = Parallel::ForkManager->new(100);
print_mem("after eating memory\n");

for (1 .. 100) {
    next if $pm->start;
    sleep 2;
    $pm->finish;
}
print_mem("after 100 forks that don't reference variable\n");
$pm->wait_all_children;
print_mem("after children are reaped\n");

for (1 .. 100) {
    next if $pm->start;
    $_ + 0 for @big; #force an update to the metadata
    sleep 2;
    $pm->finish;
}
print_mem("after 100 forks that touch the variables metadata\n");
$pm->wait_all_children;
print_mem("after children are reaped\n");

$_ + 0 for @big; #force an update to the metadata
print_mem("after parent has updated the metadata\n");

for (1 .. 100) {
    next if $pm->start;
    $_ + 0 for @big; #force an update to the metadata
    sleep 2;
    $pm->finish;
}
print_mem("after 100 forks that touch the variables metadata\n");
$pm->wait_all_children;
print_mem("after children are reaped\n");
Even if you avoid COW at startup and during the run, don't forget the END phase of the program's lifetime. During shutdown perl performs two garbage-collection passes; the first updates reference counts on every SV, which dirties the shared pages and can blow up your memory usage right at exit. You can solve it, ugly as it is, with:
END { kill 9, $$ }
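A slightly less violent variant of the same idea is POSIX::_exit, which terminates the process without running global destruction (a sketch; note it also skips any other END blocks and DESTROY methods, so use it only in forked children that have nothing to clean up):

```perl
use POSIX ();

END {
    # Bypass perl's global destruction: no destructors run and no
    # reference counts are touched, so shared pages stay shared.
    POSIX::_exit(0);
}
```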
This goes without saying, but COW doesn't happen on a per-struct basis; it happens on a memory-page basis. It's enough for one thing in an entire memory page to be modified like this for you to pay the copying cost of the whole page.
On Linux you can query the page size like this:
getconf PAGESIZE
On my system that's 4096 bytes. You can fit a lot of Perl scalar structs in that space, and if any one of them gets modified, Linux has to copy the entire page.
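The same value is available from inside Perl via POSIX's sysconf (a sketch):

```perl
use POSIX qw(sysconf _SC_PAGESIZE);

my $page = sysconf(_SC_PAGESIZE);
print "page size: $page bytes\n";   # typically 4096 on x86 Linux
```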
This is why using memory arenas is a good idea in general. You should separate your mutable and immutable data so that you won't have to pay COW costs for immutable data just because it happened to reside in the same memory page as mutable data.
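One way to apply that idea to the example above (a sketch; get_int is an illustrative helper, not an existing API): store large immutable data in one contiguous packed string rather than a million individual SVs. Reads via substr and unpack create small private scalars in each child, while the pages of the big buffer are never written and therefore stay shared.

```perl
use strict;
use warnings;

# One ~4 MB string instead of 1_000_000 scalar structs: there is no
# per-element metadata for a numeric use to dirty.
my $buf = pack "l*", 1 .. 1_000_000;

# Read the i-th 32-bit integer out of the shared buffer.
sub get_int {
    my ($i) = @_;
    return unpack "l", substr($buf, $i * 4, 4);
}

print get_int(0), " ", get_int(999_999), "\n";   # prints "1 1000000"
```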