RSS escaped HTML

开发者 (Developer) https://www.devze.com 2023-02-16 07:59 Source: web

My understanding of RSS's "escaped HTML" is that something like this:

HTML:

1 < 3

becomes (RSS):

1 &lt; 3

So, then, should this:

<img src="http://somehost/someimage?a=foo&amp;b=bar" />

Become:

&lt;img src="http://somehost/someimage?a=foo&amp;amp;b=bar" /&gt;

(Note the &amp;amp;.) If yes, is this then invalid RSS?

<description>
    ...
    &#60;img src="http://d.yimg.com/a/p/ap/20110309/capt.f6...02-0.jpg?x=91&amp;y=130&amp;q=85&amp;sig=6oI7fIgN0izc9olfgY56vw--" />
</description>

(Additionally, is the fact that the closing > isn't escaped bad?)

The problem I'm having with the <description> above is that once you decode the first layer of entities (XML) to arrive at the contents of the <description> tag, you get one long run of character data, which should be HTML. But the URL in the <img> then contains a bare &, which is not a valid entity reference. For the massive chunk above, I get something like <img src="....?x=1&y=2" />, which isn't valid HTML.

Am I just looking at crappy HTML that got shoved into RSS, or am I missing something here?
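
The two layers of escaping discussed in the question can be checked with Python's standard library. This is only a sketch under assumptions: the URL is the placeholder from the example, and `feed_fragment` is a shortened, hypothetical stand-in for the real feed's <description>, not the feed itself.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# HTML as it would appear in a web page (the & in the URL is already &amp;).
html_fragment = '<img src="http://somehost/someimage?a=foo&amp;b=bar" />'

# Escaping for XML adds a second layer: & -> &amp;, < -> &lt;, > -> &gt;.
rss_text = escape(html_fragment)
print(rss_text)  # &lt;img src="http://somehost/someimage?a=foo&amp;amp;b=bar" /&gt;

# Parsing the XML removes exactly one layer and gives back the original HTML.
item = ET.fromstring("<description>" + rss_text + "</description>")
assert item.text == html_fragment

# Hypothetical, shortened description that was escaped only once, as in the
# feed above. The unescaped > is accepted by the XML parser, but the decoded
# HTML ends up with a bare & in the URL.
feed_fragment = (
    '<description>&#60;img src="http://somehost/someimage'
    '?x=91&amp;y=130" /></description>'
)
decoded = ET.fromstring(feed_fragment).text
print(decoded)  # <img src="http://somehost/someimage?x=91&y=130" />
```

If this reading is right, the &amp;amp; form is what a strict double-escaping would produce, and the feed quoted above is well-formed XML that carries sloppy HTML.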


You need to use CDATA sections:

<description><![CDATA[ <img src="http://somehost/someimage?a=foo&amp;b=bar" /> ]]>
</description>
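
As a rough illustration of the CDATA approach, the sketch below wraps an HTML fragment in a CDATA section and parses it back with Python's standard library. The helper name `cdata_description` is hypothetical, and the URL is the placeholder used above.

```python
import xml.etree.ElementTree as ET

html_fragment = '<img src="http://somehost/someimage?a=foo&amp;b=bar" />'


def cdata_description(html: str) -> str:
    """Wrap HTML in a CDATA section inside <description> (hypothetical helper)."""
    # The sequence "]]>" would end the section early, so split it across
    # two adjacent CDATA sections if it ever appears in the HTML.
    safe = html.replace("]]>", "]]]]><![CDATA[>")
    return "<description><![CDATA[" + safe + "]]></description>"


xml_text = cdata_description(html_fragment)
print(xml_text)

# Inside CDATA nothing is decoded, so the parser returns the HTML unchanged,
# including the single &amp; that the HTML itself requires.
parsed = ET.fromstring(xml_text).text
assert parsed == html_fragment
```

With CDATA the HTML needs only its own layer of escaping, which avoids the &amp;amp; form entirely.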