Parse HTML in Adobe AIR

Developer https://www.devze.com 2022-12-15 19:32 Source: web
I am trying to load and parse HTML in Adobe AIR. The main purpose is to extract the title, meta tags and links. I have been trying HTMLLoader, but I get all sorts of errors, mainly uncaught JavaScript exceptions.

I also tried loading the HTML content directly (using URLLoader) and pushing the text into HTMLLoader (using loadString(...)), but got the same error. My last resort was to load the text into XML and then use E4X queries or XPath. No luck there, because the HTML is not well formed.

My questions are:

  1. Is there a simple and reliable DOM component for AIR/ActionScript? (I do not need to display the page, so headless mode will do.)
  2. Is there any library to convert (crappy) HTML into well-formed XML so I can use XPath/E4X?
  3. Any other suggestions on how to do this?

thx


ActionScript is supposed to be a superset of JavaScript, and thankfully, there's...

Pure JavaScript/ActionScript HTML Parser

created by JavaScript guru and jQuery creator John Resig :-)

One approach is to run the HTML through HTMLtoXML() then use E4X as you please :)
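
To make the E4X half of that concrete, here is a rough sketch of the queries once you have well-formed markup. It does not call HTMLtoXML() itself: the literal string below is only a stand-in for whatever that function returns, on the assumption that the parser has been ported to (or is otherwise callable from) ActionScript. The class name and the sample markup are made up for illustration.

```actionscript
package {
    import flash.display.Sprite;

    // Hypothetical demo class; the name and sample markup are illustrative only.
    public class E4XQueryDemo extends Sprite {
        public function E4XQueryDemo() {
            // Stand-in for the well-formed string HTMLtoXML() would produce (assumption).
            var wellFormed:String =
                '<html><head><title>Demo page</title>' +
                '<meta name="description" content="A test page"/>' +
                '</head><body><p><a href="/one">1</a> <a name="top">2</a></p></body></html>';

            var doc:XML = new XML(wellFormed);

            // Child access by element name.
            trace(String(doc.head.title.text()));

            // The descendant operator (..) finds elements at any depth.
            for each (var meta:XML in doc..meta) {
                trace(String(meta.@name) + " = " + String(meta.@content));
            }

            // Keep only anchors that actually carry an href attribute.
            for each (var link:XML in doc..a.(hasOwnProperty("@href"))) {
                trace(String(link.@href));
            }
        }
    }
}
```

One caveat: if the markup declares a default namespace (XHTML pages usually have xmlns="http://www.w3.org/1999/xhtml" on the root element), these unqualified queries will come back empty until you set a matching default XML namespace or qualify the names.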


Afaik:

  1. No :-(
  2. No :-(
  3. I think the easiest way to grab the title and meta tags is to write some regular expressions. You can load the page's HTML code into a string and then read out whatever you need like this:

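var str:String = ""; // put the page's HTML source here

// [^>]* tolerates attributes on the tag; [\s\S]*? matches lazily, even across line breaks
var pattern:RegExp = /<title\b[^>]*>([\s\S]*?)<\/title>/i;

// exec() returns null when nothing matches; index 1 holds the captured title text
var match:Object = pattern.exec(str);
trace(match ? match[1] : "no title found");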
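
The same idea stretches to meta tags and links. What follows is only a sketch: the class name, the helper names and the sample markup are all made up, and the patterns assume quoted attribute values, so unquoted or otherwise unusual markup will slip through. With the g flag set, each exec() call resumes from the pattern's lastIndex, which is what lets the loop collect every match.

```actionscript
package {
    import flash.display.Sprite;

    // Hypothetical demo class; names and sample markup are illustrative only.
    public class RegexScrapeDemo extends Sprite {

        public function RegexScrapeDemo() {
            var html:String =
                '<html><head><TITLE>Demo page</TITLE>' +
                '<meta name="description" content="A test page">' +
                '</head><body><a href="/one">1</a> <a class="x" href=\'/two\'>2</a>' +
                '</body></html>';

            trace(extractTitle(html));

            // Whole <meta ...> tags (group 0 is the full match).
            trace(extractAll(html, /<meta\b[^>]*>/gi).join("\n"));

            // href values of <a> tags (group 1), quoted values only.
            var hrefPattern:RegExp = /<a\b[^>]*\bhref\s*=\s*["']([^"']*)["']/gi;
            trace(extractAll(html, hrefPattern, 1).join("\n"));
        }

        // Returns the text between <title> and </title>, or null if there is none.
        private function extractTitle(html:String):String {
            var pattern:RegExp = /<title\b[^>]*>([\s\S]*?)<\/title>/i;
            var m:Object = pattern.exec(html);
            return m ? String(m[1]) : null;
        }

        // Collects the requested capture group from every match of a global pattern.
        private function extractAll(html:String, pattern:RegExp, group:int = 0):Array {
            if (!pattern.global) {
                // Without the g flag, exec() would return the first match forever.
                throw new ArgumentError("pattern needs the g flag");
            }
            var results:Array = [];
            var m:Object;
            while ((m = pattern.exec(html)) != null) {
                results.push(String(m[group]));
            }
            return results;
        }
    }
}
```

Regular expressions are fine for pulling a handful of fields out of a page like this, but they are not a real parser: commented-out markup, script blocks that contain tag-like text, and unquoted attributes can all produce wrong or missing results.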