Why is XML::LibXML->toString() showing expected XML but findnodes() finding more XML?

Developer https://www.devze.com 2023-02-14 05:09 Source: Internet
I have this subroutine that is passed in a chunk of xml and tries to locate some elements (using XML::LibXML and XPath):

sub new_from_xml {
    my $class = shift;
    my ( $xml ) = @_;

    my $self = {};

    foreach (qw[ width height ]) {
        $self->{$_} = $xml->findnodes("//$_")->[0]->textContent;
    }

    $self->{type} = $xml->findnodes("//type")->[0]->textContent;

    $self->{url} = URI->new( $xml->findnodes("//url")->[0]->textContent );

    return $class->new( $self );
}

It gets called from here:

sub new_from_xml {
    my $class = shift;
    my ( $xml ) = @_;

    my $self = {};

    foreach (qw[id caption orientation]) {
        $self->{$_} = $xml->findnodes("//$_")->[0]->textContent;
    }

    $self->{alt} = $xml->findnodes('//htmlAlt')->[0]->textContent;

    foreach my $instance ( $xml->findnodes("//instance") ) {
        my $photo =
            WWW::NewsReach::Photo::Instance->new_from_xml( $instance );
        push @{$self->{instances}}, $photo;
    }
    return $class->new( $self );
}

What I expect is that findnodes() returns two <instance> ... </instance> blocks, and that as I loop over them I pass the first instance on the first call and the second instance on the second call.

This is what I see in the debugger (I'm in the first subroutine above, WWW::NewsReach::Photo::Instance->new_from_xml).

DB<13> x $xml->toString
0     '<instance><width>100</width><height>66</height><type>Small</type><url>http://pictures.newsreach.co.uk/liveimages/Decor-tips-for-guaranteed-unsecured-loans-users.jpg</url></instance>'

Okay, good, that's what I expected.

DB<14> x $xml->findnodes("//type")->[0]->textContent
0  'Medium'

Wait, what? That wasn't present in the XML shown from toString. Where did this come from?

DB<15> x $xml->findnodes("//type")
0  XML::LibXML::Element=SCALAR(0x101d03780)
  -> 4334788352
1  XML::LibXML::Element=SCALAR(0x101cdd5c0)
  -> 4334949168

Hmm, so there are two <type> ... </type> elements.

DB<16> x $xml->toString;
0  '<instance><width>100</width><height>66</height><type>Small</type><url>http://pictures.newsreach.co.uk/liveimages/Decor-tips-for-guaranteed-unsecured-loans-users.jpg</url></instance>'

Umm, there's definitely only meant to be one <type> ... </type> element. What is going on here?

Why does toString show one <instance> ... </instance> element when the actual XML clearly contains two <instance> ... </instance> elements? Any help would be greatly appreciated.


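//foo is an absolute path: it starts searching from the document root, no matter which node you call findnodes() on. toString() only serializes the node you call it on, which is why the two disagree. To search only beneath the current element, you want the relative form, .//foo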
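To make the difference concrete, here is a minimal sketch that reproduces the symptom. The sample document (including the <photo> wrapper and the second set of dimensions) is invented for illustration and only loosely modelled on the XML in the question; the real feed may well look different.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document, loosely modelled on the question's XML.
my $doc = XML::LibXML->load_xml( string => <<'XML' );
<photo>
  <instance><width>200</width><height>133</height><type>Medium</type></instance>
  <instance><width>100</width><height>66</height><type>Small</type></instance>
</photo>
XML

# Pick the second <instance>, the one whose toString() shows "Small".
my @instances = $doc->findnodes('//instance');
my $small     = $instances[1];

# Absolute path: evaluated from the document root, so every <type> in the
# document matches and the first hit belongs to a different <instance>.
my @absolute = $small->findnodes('//type');

# Relative path: evaluated from the context node, so only the <type>
# inside this <instance> matches.
my @relative = $small->findnodes('.//type');

printf "//type  matched %d node(s), first is '%s'\n",
    scalar @absolute, $absolute[0]->textContent;
printf ".//type matched %d node(s), first is '%s'\n",
    scalar @relative, $relative[0]->textContent;
```

In the subroutines from the question, that would presumably mean changing "//$_", "//type" and "//url" to ".//$_", ".//type" and ".//url" (or "./type" and so on, if the elements are always direct children). If only the text is needed, $node->findvalue('./type') is another option that avoids the ->[0]->textContent chain.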
