How do I turn HTML that is not well formed into well-formed HTML using classic ASP and VBScript?

Developer https://www.devze.com 2023-02-17 14:51 Source: Internet
I am trying to parse some HTML to switch out the values of various element attributes. I decided that the most reliable way to parse the HTML was to use an XML parser (MSXML).

The problem is that the HTML I'm trying to parse contains attributes like:

<param name="flashvars" value="autoplay=false&amp;brand=embed&amp;cid=97%2Ftest&amp;locale=en_US"/>

This causes the XML parser to blow up. I figured out that I need to Server.HTMLEncode() the value attribute before the XML parser will load it properly. How do I approach this?

I feel like the problem is a vicious circle: I can't use regexes because HTML is not regular enough, and now I can't use XML parsers because the HTML isn't "well formed".

Help! I want to be able to change attribute values with VBScript.


Is your HTML well formed? If so, you could simply use an XML DOMDocument and use XPath to find the attributes you want to replace.

You can also use JScript server-side in ASP, which might give you access to HTML DOM libraries you could use.

You should probably have a look at one of the libraries for cleaning up HTML, such as HTML Tidy: http://www.w3.org/People/Raggett/tidy/

Your main problem is that you need to do a replace on the ampersands: they need to be &amp; in well-formed XML/XHTML.
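
A minimal sketch of that ampersand replace, written in ES3-style JavaScript so that it could in principle run as server-side JScript in ASP. The helper name escapeBareAmpersands and the sample input string are assumptions made for illustration; neither comes from ASP or MSXML. The idea is to escape only the ampersands that do not already begin an entity or character reference, so running it twice does not double-encode anything. The same pattern should carry over to VBScript's RegExp object, though that is untested here.

```javascript
// Hypothetical helper: escape ampersands that are not already part of
// an entity (&name;) or a character reference (&#123; or &#x1F;).
function escapeBareAmpersands(html) {
  // Negative lookahead: match "&" only when it is NOT followed by
  // "name;", "#digits;" or "#xhex;".
  var bareAmp = /&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/g;
  return html.replace(bareAmp, "&amp;");
}

// Assumed sample input with raw ampersands in a query-string style value.
var raw = '<param name="flashvars" value="autoplay=false&brand=embed&cid=97%2Ftest"/>';
var fixed = escapeBareAmpersands(raw);

// Already-escaped input passes through unchanged.
var again = escapeBareAmpersands(fixed);
```

Note that this only deals with ampersands. Named HTML entities such as &nbsp; are left alone by the helper, but XML itself predefines only amp, lt, gt, quot and apos, so an XML parser may still reject the others unless they are converted to numeric references or the markup is cleaned by a tool like HTML Tidy first.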
