I have this webpage that uses client-side JavaScript to format data on the page before it's displayed to the user.
Is it possible to somehow use wget to download the page and use some sort of client-side JavaScript engine to format the data as it would be displayed in a browser?
You could probably make that happen with something like PhantomJS.
You can write a phantomjs script that will load the page like a browser would, and then either take screenshots or use JS to inspect the page and pull out data.
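As a rough sketch of the "inspect the page and pull out data" option, the script below uses page.evaluate to run a function inside the loaded page and prints what it returns as JSON. The extractData name and the 'h1, h2' selector are placeholders for illustration, not anything taken from the page in question; swap in whatever selectors match the data you want.

```javascript
// Runs inside the loaded page, so it may only rely on page globals
// such as `document`. The 'h1, h2' selector is a placeholder.
function extractData() {
    var nodes = document.querySelectorAll('h1, h2');
    var headings = Array.prototype.map.call(nodes, function (node) {
        return node.textContent.trim();
    });
    return { title: document.title, headings: headings };
}

// Only drive PhantomJS when the script is actually run by it.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var address = require('system').args[1]; // URL from the command line

    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('** Error loading url.');
            phantom.exit(1);
            return;
        }
        // page.evaluate serializes the function, runs it in the page
        // context, and hands back its JSON-serializable return value.
        var data = page.evaluate(extractData);
        console.log(JSON.stringify(data, null, 2));
        phantom.exit(0);
    });
}
```

Assuming the script is saved as extract.js (a made-up filename), it would be run as: phantomjs extract.js "http://example.com/"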
Here is a simple PhantomJS script that loads a webpage, lets its JavaScript run, and prints the resulting HTML so you can save it locally:
file: get.js
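// Load a page in PhantomJS and print its HTML after its scripts have run.
var page = require('webpage').create(),
    system = require('system'),
    address = system.args[1]; // URL passed on the command line

// Start scrolled 4000px down the page.
page.scrollPosition = { top: 4000, left: 0 };

page.open(address, function (status) {
    if (status !== 'success') {
        console.log('** Error loading url.');
    } else {
        // page.content holds the current HTML of the page
        console.log(page.content);
    }
    phantom.exit();
});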
Use it as follows:
$> phantomjs /path/to/get.js "http://www.google.com" > "google.html"
Change /path/to, the URL, and the filename to what you want.
Not with wget alone, as it does not include a JavaScript engine. However, you could use WebKit to render the page and then work from the rendered output.
Something like this could serve as a base for getting the content: http://situated.wordpress.com/2008/06/04/take-screenshots-of-a-website-from-the-command-line/