I have written Perl code to actually create a Trie datastructure given a set of words in an array. Now I have problems traversing and printing the words.
Also pasted the Dumper output of the Datastructure created.
The final set of words after traversal doesn't seem to be right since the traversal logic is certainly missing something. But the trie creation is fine and works fast. Can someone help me here?
The top level of the trie is a hash
Each hash item has a key which is a letter and each hash points to a array ref.
Array ref again contains a list of hashes and each hash item is same as 1
If you see the first word in the output. It comes up as archtopriumwe.
We should get arc,arch,atop,atrium,awe
CODE
use Data::Dumper; my %mainhash; ## Subroutine sub storeword { my $type = shift; my $fc = shift; my $word = shift; return if ((not defined $word) or (length($word) == 0)); my @letters = split (//,$word); my $len = scalar(@letters) - 1; my ($arr_ref,$pass_ref,$flag ,$i,$hashitem,$newitem); $pass_ref = $hashitem = $new_item = undef; $arr_ref = $type; $setstop = 1 if (length($word) == 1); $flag =0; for($i = 0;$i {$letters[0]}) { $flag =1; $pass_ref = $hashitem->{$letters[0]}; last; } } if ($flag == 0) { $newitem->{$letters[0]} = []; push(@$arr_ref,$newitem); $pass_ref = $newitem->{$letters[0]}; } storeword($pass_ref,$letters[0],join ('',@letters[ 1..$len])); } ## Subroutine sub process { my ($prefix,$trie) = @_; for my $letter (sort keys %$trie) { if ( @{ $trie->{$letter} } ) { for my $branch (@{ $trie->{$letter} }) { process("$prefix$letter", $branch); } } else { print "$prefix$letter\n"; } } } ##main ##list of words my @wd = qw (arc atop awe blob boil fame tub arch atrium); ##inserting each word into the datastructure foreach my $w (@wd) { my @letters = split (//,$w); my $len = scalar(@letters) - 1; if (not exists $mainhash{$letters[0]}) { $mainhash{$letters[0]} = []; } storeword($mainhash{$letters[0]},$letters[0],join ('',@letters[ 1..$len])); } print Dumper(%mainhash); ## Trying to print each word from trie. print("\n List of words\n"); process('',\%mainhash);
Output:
$VAR1 = 'a'; $VAR2 = [ { 'r' => [ { 'c' => [ { 开发者_运维问答 'h' => [] } ] } ] }, { 't' => [ { 'o' => [ { 'p' => [] } ] }, { 'r' => [ { 'i' => [ { 'u' => [ { 'm' => [] } ] } ] } ] } ] }, { 'w' => [ { 'e' => [] } ] } ]; $VAR3 = 'b'; $VAR4 = [ { 'l' => [ { 'o' => [ { 'b' => [] } ] } ] }, { 'o' => [ { 'i' => [ { 'l' => [] } ] } ] } ]; $VAR5 = 'f'; $VAR6 = [ { 'a' => [ { 'm' => [ { 'e' => [] } ] } ] } ]; $VAR7 = 't'; $VAR8 = [ { 'u' => [ { 'b' => [] } ] } ]; List of words archtopriumwe bloboil fame tub
Do you see that your code as is is only printing each letter in the datastructure once, instead of once per word it is in? And only printing a newline once for each top-level letter in the tree, not one per word?
To fix this, you need to pass some more context into your recursive sub. Something like this:
sub process {
my ($prefix, $trie) = @_;
for my $letter (sort keys %$trie) {
if ( @{ $trie->{$letter} } ) {
for my $branch (@{ $trie->{$letter} }) {
process("$prefix$letter", $branch);
}
}
else {
print "$prefix$letter\n";
}
}
}
print("\n List of words\n");
process('', \%mainhash);
This doesn't print arc, because you provide no way to tell in your datastructure that arc is a word but e.g. boi is not. The value for each letter needs to provide two things: a boolean indicator that this is the end of a word, and a list of possible following letters and their sub-trie.
精彩评论