I am trying to extract specific content (links, text, images) from an HTML page. Is there a program out there that I can use to produce a visual representation of the DOM model of the page? I know I could write such a program in Java using an HTML parser, but before I do that, I thought I would see if such a program already exists.
My main objective is to extract certain links, image URLs, and text, and send these to a Flex applet on the page. A rough sketch of what I mean is below. Thanks, Vance
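For concreteness, here is a minimal sketch (plain JavaScript, no library) of the kind of extraction I have in mind. The applet element id `flexApp` and the `receiveContent` callback are only placeholders for whatever my applet would actually expose through ExternalInterface:

```javascript
// Collect link hrefs, image URLs, and the page's visible text,
// then hand them to the Flex applet embedded in the page.
function collectContent() {
    var links = [], images = [];

    var anchors = document.getElementsByTagName('a');
    for (var i = 0; i < anchors.length; i++) {
        if (anchors[i].href) { links.push(anchors[i].href); }
    }

    var imgs = document.getElementsByTagName('img');
    for (var j = 0; j < imgs.length; j++) {
        if (imgs[j].src) { images.push(imgs[j].src); }
    }

    // innerText works in IE/WebKit, textContent in Firefox.
    var text = document.body.innerText || document.body.textContent;
    return { links: links, images: images, text: text };
}

// "flexApp" is the id of the <object>/<embed> hosting the applet (placeholder),
// and "receiveContent" is a callback the applet would register via
// ExternalInterface.addCallback (also a placeholder name).
var applet = document.getElementById('flexApp');
var content = collectContent();
if (applet && applet.receiveContent) {
    applet.receiveContent(content.links.join(','), content.images.join(','), content.text);
}
```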
If you just want to extract a few bits of information (rather than, say, print out the entire page structure), you can use the Firebug extension for Firefox.
Choose the HTML tab, click on the second icon from the left (it looks like a cursor pointing at a box), then click on the part of the page you're interested in to jump to that part of the DOM.
I think your best bet would be jQuery and GreaseMonkey. GreaseMonkey would inject the script into the page, and jQuery can efficiently traverse and extract from the HTML DOM; a rough userscript sketch follows. Note that this is possibly a Firefox-only solution, since I believe GreaseMonkey is a Firefox-only extension.
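A minimal userscript sketch along those lines, assuming jQuery is pulled in with a `@require` directive and the Flex applet exposes some JavaScript-callable method. The element id `flexApp` and the `receiveContent` callback name are placeholders, not anything your applet actually defines:

```javascript
// ==UserScript==
// @name         Extract links, images, and text
// @include      http://example.com/*
// @require      http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js
// ==/UserScript==

// Collect every link href on the page.
var links = [];
$('a[href]').each(function() { links.push($(this).attr('href')); });

// Collect every image URL.
var images = [];
$('img[src]').each(function() { images.push($(this).attr('src')); });

// Collect the page's visible text.
var text = $('body').text();

// Hand the results to the Flex applet. GreaseMonkey sandboxes scripts, so
// the raw page element is reached via unsafeWindow; "flexApp" and
// "receiveContent" are placeholder names for whatever the applet exposes.
var applet = unsafeWindow.document.getElementById('flexApp');
if (applet && applet.receiveContent) {
    applet.receiveContent(links.join(','), images.join(','), text);
}
```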