BeautifulSoup.findAll() in Perl

Developer (https://www.devze.com), 2022-12-11 13:36. Source: Internet
I need to pull out all of the "NodeGroup" elements out of an XML file:

<Database>
  <Get>
    <Data>
      <NodeGroups>
        <NodeGroup>
          <AssociateNode ConnID="6748763_2" />
          <AssociateNode ConnID="6748763_1" />
          <Data DataType="Capacity">2</Data>
          <Name>Alpha</Name>
        </NodeGroup>
        <NodeGroup>
          <AssociateNode ConnID="6748763_23" />
          <AssociateNode ConnID="6748763_7" />
          <Data DataType="Capacity">2</Data>
          <Name>Charlie</Name>
        </NodeGroup>
        <NodeGroup>
          <AssociateNode ConnID="6748763_98" />
          <AssociateNode ConnID="6748763_12" />
          <Data DataType="Capacity">2</Data>
          <Name>Papa</Name>
        </NodeGroup>
        <NodeGroup>
          <AssociateNode ConnID="6748763_8" />
          <AssociateNode ConnID="6748763_45" />
          <Data DataType="Capacity">2</Data>
          <Name>Yankee</Name>
        </NodeGroup>
      </NodeGroups>
      <System>
        ...
      </System>
    </Data>
  </Get>
</Database>

If I could use Python and BeautifulSoup, I would parse the XML and call something like:

node_group_array = soup.findAll("nodegroups")

But I am using Perl and Perl's XML modules, so I used XML::Simple's XMLin and recursively walked through each hash key, checking whether the value was a hash, whether it was the "NodeGroup" hash, and so on.

I would think that there's something like soup.findAll() in one of Perl's XML modules, but I can't find it. How do I do "soup.findAll('nodegroups')" in Perl?


To clarify Randal's answer a bit, I think you want the XML::LibXML::XPathContext API provided by the XML::LibXML distribution:

my $xpath = XML::LibXML::XPathContext->new($document);
for my $node ( $xpath->findnodes('//NodeGroup') ) { ... }
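
As a fuller illustration of the XPath approach, here is a minimal sketch. It assumes the XML::LibXML distribution is installed, and it inlines a trimmed copy of the question's data so that it runs on its own; the variable names are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed copy of the sample data from the question (assumption: two groups are enough to show the idea).
my $xml = <<'XML';
<Database>
  <Get>
    <Data>
      <NodeGroups>
        <NodeGroup>
          <AssociateNode ConnID="6748763_2" />
          <AssociateNode ConnID="6748763_1" />
          <Data DataType="Capacity">2</Data>
          <Name>Alpha</Name>
        </NodeGroup>
        <NodeGroup>
          <AssociateNode ConnID="6748763_23" />
          <AssociateNode ConnID="6748763_7" />
          <Data DataType="Capacity">2</Data>
          <Name>Charlie</Name>
        </NodeGroup>
      </NodeGroups>
    </Data>
  </Get>
</Database>
XML

my $document = XML::LibXML->load_xml(string => $xml);
my $xpath    = XML::LibXML::XPathContext->new($document);

# In list context, findnodes() returns every node matching the expression.
my @node_groups = $xpath->findnodes('//NodeGroup');

my @names;
for my $node (@node_groups) {
    # The second argument makes the expression relative to this NodeGroup.
    my $name     = $xpath->findvalue('./Name', $node);
    my @conn_ids = map { $_->getAttribute('ConnID') }
                   $xpath->findnodes('./AssociateNode', $node);
    push @names, $name;
    print "$name: @conn_ids\n";
}
```

The '//NodeGroup' expression matches the element at any depth, which is probably the closest equivalent to findAll(); a full path such as '/Database/Get/Data/NodeGroups/NodeGroup' would be stricter.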


There is no "XML" module in Perl. There are many modules in the XML:: namespace. My favorite is XML::LibXML, but for something this simple, you could even use HTML::Parser with its xml_mode option enabled.


XML::DOM has getElementsByTagName (so do XML::LibXML::DOM and XML::GDOME) which works like the DOM function of the same name.
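
A minimal sketch of the DOM-style lookup, here using XML::LibXML (assumed to be installed) rather than XML::DOM; the inlined two-group document is a trimmed stand-in for the question's file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed stand-in for the question's data file.
my $xml = <<'XML';
<Database>
  <Get>
    <Data>
      <NodeGroups>
        <NodeGroup>
          <Data DataType="Capacity">2</Data>
          <Name>Papa</Name>
        </NodeGroup>
        <NodeGroup>
          <Data DataType="Capacity">2</Data>
          <Name>Yankee</Name>
        </NodeGroup>
      </NodeGroups>
    </Data>
  </Get>
</Database>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Collects all descendant elements with this tag name, in document order.
my @groups = $doc->getElementsByTagName('NodeGroup');

my @group_names;
for my $group (@groups) {
    # The same method works on an element, searching only below it.
    my ($name_el) = $group->getElementsByTagName('Name');
    push @group_names, $name_el->textContent;
}
print join(', ', @group_names), "\n";
```

Because the lookup is by tag name only, a call such as getElementsByTagName('Data') would return both the outer <Data> wrapper and the <Data> elements inside each group, so it suits names that are unique in the document.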


Using XML::Simple with the data file shown:

#!/usr/bin/perl

use strict; use warnings;

use XML::Simple;

my $db = XMLin($ARGV[0]);
my $nodegroups = $db->{Get}{Data}{NodeGroups}{NodeGroup};

use Data::Dumper;
print Dumper $nodegroups;

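You might want to use the ForceArray => 1 option to guarantee consistency in case you have some files with multiple <NodeGroups>...</NodeGroups> sections and others with a single such section. Note that ForceArray => 1 turns every nested element into an array reference, so the lookup above would become $db->{Get}[0]{Data}[0]{NodeGroups}[0]{NodeGroup}; passing a list of names, such as ForceArray => ['NodeGroup'], limits the effect to those elements.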
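
To illustrate the consistency point, here is a small sketch, assuming XML::Simple is installed. The input deliberately contains a single <NodeGroup>, and ForceArray is given a list of element names so that only those elements are forced into arrays; KeyAttr => [] is added as a precaution to switch off XML::Simple's array folding.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# A file with only one NodeGroup, the case where the structure would otherwise differ.
my $xml = <<'XML';
<Database>
  <Get>
    <Data>
      <NodeGroups>
        <NodeGroup>
          <AssociateNode ConnID="6748763_98" />
          <AssociateNode ConnID="6748763_12" />
          <Name>Papa</Name>
        </NodeGroup>
      </NodeGroups>
    </Data>
  </Get>
</Database>
XML

# XMLin() accepts a string of XML as well as a file name.
my $db = XMLin(
    $xml,
    ForceArray => [ 'NodeGroup', 'AssociateNode' ],  # always array refs for these
    KeyAttr    => [],                                # no folding on name/key/id
);

# The root element is discarded by default, so the path starts at Get.
my $nodegroups = $db->{Get}{Data}{NodeGroups}{NodeGroup};

for my $group (@$nodegroups) {
    my @conn_ids = map { $_->{ConnID} } @{ $group->{AssociateNode} };
    print "$group->{Name}: @conn_ids\n";
}
```

With this setup the loop works the same way whether a file holds one group or many.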

If the files are not too big, using XML::Simple should be fine. See also the caveats section in the documentation.
