Suppose $my_ref = \$hash{'mary'};
so $my_ref is a reference pointing to a hash element. How can I use $my_ref to retrieve the key of the hash element it points to? I.e., how do I get the string 'mary' from $my_ref?
I ask this question because I have several groups of user names, and some user names appear in multiple groups, which consumes memory. So I decided to create a common user name list and let the groups store only a reference to the corresponding user name rather than the user name itself.
E.g., originally:
%group1 = ('mary'=>1, 'luke'=>1, 'tom'=>1, ...);
%group2 = ('mary'=>1, 'sam'=>1, 'tom'=>1, ...);
Here you see that 'mary' and 'tom' appear in both group1 and group2, which consumes memory. (Note that I do not care about the values in this example; they are here only because the data structure is a hash.) So to reduce memory, I want to have a common list that stores all user names:
%common_hash = ('mary'=>1, 'luke'=>1, 'tom'=>1, 'sam'=>1, ...);
$ref1 = \$common_hash{'mary'};
$ref2 = \$common_hash{'luke'};
$ref3 = \$common_hash{'tom'};
$ref4 = \$common_hash{'sam'};
The groups then store only references to the hash elements:
%group1 = ($ref1=>1, $ref2=>1, $ref3=>1, ...);
%group2 = ($ref1=>1, $ref4=>1, $ref3=>1, ...);
I think this approach can save much memory because:
- each user name is stored in memory once, not multiple times;
- groups store a reference (an integer) rather than a string (in my case, each user name is 30 bytes long on average, while each integer is only 4 bytes (32-bit system) or 8 bytes (64-bit system)) (BTW, correct me if an integer does not use 4 bytes or 8 bytes);
- using a reference I can access the user name immediately without looking for it.
But how can I get the user name from a group?
If I use @my_ref = keys %group1, I think I will get the value of 'mary', but not 'mary' itself.
$result = ${$my_ref[0]};
A reference is not an integer; it's an SV, so it's going to be something like 24 bytes, not 4.
Not that it matters, because you're not storing references: hash keys are always strings. The keys of your %group1 etc. hashes are actually strings that look like "SCALAR(0x19838e2)", which is useless. Not that it matters, because Perl is smart enough to avoid wasting memory if the same strings are used as keys in multiple hashes. That's right, if you just did things the simple, obvious, sensible way, perl would use less memory than it does with the complicated thing you're trying to do.
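To see the stringification for yourself, here is a minimal sketch. The names %common and %group1 are made up for illustration, and the exact address printed will differ from run to run.

```perl
use strict;
use warnings;

my %common = (mary => 1, luke => 1);
my $ref    = \$common{mary};

# Using the reference as a hash key silently turns it into a plain string.
my %group1 = ($ref => 1);
my ($key)  = keys %group1;

print "key: $key\n";    # something like SCALAR(0x...), address varies
print "still a reference? ", (ref($key) ? "yes" : "no"), "\n";    # no
```

Since the key is only a string, it cannot be dereferenced to get back to the element of %common, let alone to the name 'mary'.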
Sorry, hashes don't work that way. You aren't saving any memory by using a reference instead of a string as a hash key, and furthermore you are:
- making it harder to find data in the hash (it is obscured)
- getting in the way of Perl's internal hash optimizations (using a hash algorithm to provide O(1) lookup inside what is effectively a list).
In either case, the hash key is a scalar, which needs to be stored somewhere. By using a reference as the hash key, now you not only need to store the reference in the hash, but also the value it is referencing, so you are now using more memory.
What led you to believe that you were saving memory by your, cough, novel approach? Have you run a memory profiler against different implementations?
Generally, you cannot get from a hash's value back to its key (although you could traverse the hash table linearly looking for it, if it were unique). If you want to keep track of both a hash key and value, you need to do it yourself. Some common approaches are:
# iterate through the table by key
foreach my $key (keys %hash)
{
    # here we have both the key and its corresponding value
    print "value at key $key is $hash{$key}\n";
}
# iterate through the table by keys and values
while (my ($key, $value) = each %hash)
{
    print "value at key $key is $value, which is the same as $hash{$key}\n";
}
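For completeness, the linear traversal mentioned above can be made to work with a reference such as $my_ref by comparing addresses rather than values. This is only a sketch: key_for_ref is a hypothetical helper name, and it assumes the element has not been deleted and re-created since the reference was taken.

```perl
use strict;
use warnings;

my %hash   = (mary => 1, luke => 1, tom => 1);
my $my_ref = \$hash{mary};

# Hypothetical helper: scan the hash and return the key whose value
# lives at the same address as the given scalar reference.
sub key_for_ref {
    my ($hash, $ref) = @_;
    for my $key (keys %$hash) {
        # References compare by address in numeric context.
        return $key if \$hash->{$key} == $ref;
    }
    return;    # not found
}

print key_for_ref(\%hash, $my_ref), "\n";    # mary
```

This is O(n) per lookup, so it defeats the purpose of avoiding a search; it only shows that the key is not recoverable from the reference alone.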
Please read up on how hashes work in the manual. You can also read about the keys and each functions.
A hash is a means of associating names with scalars. If you have a hash and a key, you have a scalar, not a reference to a hash bucket or anything like that.
my $value = $hash{name};
Is just a scalar.
my $ref = \$hash{name};
Is just a reference to a scalar. No more capable of containing information that allows you to back trace to a hash key than an anonymous reference can tell you what the name might be on the symbol table, or the lexical pad (without some help).
Try thinking about it like database tables. Have a user "table" / hash that associates a user id to information about the user, and have the other hashes use the user id, instead of the user's information.
my $userid = 5;
$user->{$userid};
# would be the hash element for the user with that user id
You could then make your group lists use the numbers instead of names / usernames.
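A rough sketch of that layout, assuming made-up ids and the illustrative names %users and group_names (none of these come from the question):

```perl
use strict;
use warnings;

# Hypothetical user "table": user id => user name
my %users = (1 => 'mary', 2 => 'luke', 3 => 'tom', 4 => 'sam');

# Groups store only the short ids as keys, not the names.
my %group1 = map { $_ => 1 } (1, 2, 3);
my %group2 = map { $_ => 1 } (1, 4, 3);

# Look up the names of a group's members through the user table.
sub group_names {
    my ($group) = @_;
    my @names = map { $users{$_} } keys %$group;
    return sort @names;
}

print join(', ', group_names(\%group1)), "\n";    # luke, mary, tom
print join(', ', group_names(\%group2)), "\n";    # mary, sam, tom
```

Note that the ids are still stored as strings when used as hash keys; any saving comes only from the ids being shorter than the names.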
However, I think you are making more work for yourself than is needed. Have you actually run into a problem with this program using too much memory? Having duplicate keys is not a problem unless your keys contain very large size strings.
If you have a thousand different usernames (all 100 chars or less) and, combined, there are 10,000 user/group relationships, then you only have:
100 bytes * 10,000 = 1MB
And to be honest, most names are 1/5 of that size: 200 KB
My suggestion would be to worry about this only if you have many MB of information (say 500 or more).