
Why are entities in libxml2 SAX-parsed attribute values encoded?

Developer https://www.devze.com 2023-02-11 12:49 Source: Internet
I'm fetching the value of an XML attribute in a libxml2 SAX parser, similarly to how the answer to this question suggests. Specifically, my code looks like this (attributes[i].value is an xmlChar *):

    int valueLength = (int) (attributes[i].end - attributes[i].value);
    value = [[[NSString alloc] initWithBytes:attributes[i].value
                                      length:valueLength
                                    encoding:NSUTF8StringEncoding
    ] autorelease];

However, for some reason, when the attribute value (a URL in this case) contains the entity &amp; in the original XML, the value I get contains &#38; instead.

Say what?

How do I get libxml2 to decode attribute entities (it seems to do it fine for text node entities), so that I just get &?


libxml2 does not replace entities by default; you have to turn that on with the XML_PARSE_NOENT parser option when you create the parser (an xmlReader in the example below).

This code has an example:

http://xmlsoft.org/examples/reader2.c

The docs for XML_PARSE_NOENT are here:

http://xmlsoft.org/html/libxml-parser.html

Although it has been a while since I used the entity bits of libxml2, I recall having to do something to get the default entity resolver in place. The docs on that are here:

http://xmlsoft.org/xmlio.html

If this does not wrap it up, please ping me back and I'll look in the source for Foto Brisko; I had to handle it there.

Although the blog post is long-winded, I think the sample from here

http://bill.dudney.net/roller/objc/entry/libxml2_push_parsing

might have the entity stuff turned on as well, but it's been so long that I've forgotten, and I don't have time right now to go back through it.

Good luck!
