wget + JavaScript?

Developer https://www.devze.com 2023-03-02 22:20 Source: Internet
I have this webpage that uses client-side JavaScript to format data on the page before it's displayed to the user.

Is it possible to somehow use wget to download the page and use some sort of client-side JavaScript engine to format the data as it would be displayed in a browser?


You could probably make that happen with something like PhantomJS.

You can write a PhantomJS script that loads the page the way a browser would, and then either take screenshots or use JavaScript to inspect the page and pull out data.
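
The "use JS to inspect the page and pull out data" route could look roughly like the sketch below. It assumes the data of interest is the page title and link targets; `extractPageData` and `formatPageData` are hypothetical names chosen for illustration. The function passed to `page.evaluate` runs inside the loaded page, so it can only use the page's own globals.

```javascript
// Runs inside the loaded page via page.evaluate, so it may only
// rely on the page's own globals such as `document`.
function extractPageData() {
  var anchors = document.querySelectorAll('a');
  var hrefs = [];
  for (var i = 0; i < anchors.length; i++) {
    hrefs.push(anchors[i].href);
  }
  return { title: document.title, links: hrefs };
}

// Turns the extracted data into plain text, one item per line.
function formatPageData(data) {
  return ['Title: ' + data.title].concat(data.links).join('\n');
}

// Only drive a page when actually running under PhantomJS.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  var system = require('system');
  page.open(system.args[1], function (status) {
    if (status === 'success') {
      // The return value of evaluate must be JSON-serializable.
      console.log(formatPageData(page.evaluate(extractPageData)));
    } else {
      console.log('** Error loading url.');
    }
    phantom.exit(status === 'success' ? 0 : 1);
  });
}
```

Swap the body of `extractPageData` for whatever selectors match the data the page's JavaScript has formatted.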


Here is a simple PhantomJS script that loads a webpage, lets its JavaScript run, and prints the resulting HTML so you can save it locally:

file: get.js

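// Load a page in PhantomJS, let its JavaScript run, and print the resulting HTML.
var page = require('webpage').create(),
  system = require('system'),
  address;

if (system.args.length < 2) {
  // system.args[0] is the script name; the URL must follow it.
  console.log('Usage: phantomjs get.js <url>');
  phantom.exit(1);
} else {
  address = system.args[1];
  // Start scrolled down the page, e.g. to reach content below the fold.
  page.scrollPosition = { top: 4000, left: 0 };
  page.open(address, function (status) {
    if (status !== 'success') {
      console.log('** Error loading url.');
    } else {
      // page.content is the current DOM serialized as HTML.
      console.log(page.content);
    }
    phantom.exit();
  });
}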

Use it as follows:
$> phantomjs /path/to/get.js "http://www.google.com" > "google.html"

Change /path/to, the URL, and the output filename to whatever you need.
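
One caveat: the load callback fires when the page has loaded, so formatting that the page's JavaScript does asynchronously (timers, Ajax) may not have finished when page.content is read. A minimal sketch that waits a fixed delay first is below. The 2000 ms default, the optional second argument, and the `parseDelay` helper are assumptions for illustration, not part of the script above.

```javascript
// Assumed default wait; tune it for the page in question.
var DEFAULT_DELAY_MS = 2000;

// Parses an optional delay argument, falling back when it is
// missing, not a number, or negative.
function parseDelay(arg, fallback) {
  var n = parseInt(arg, 10);
  return (isNaN(n) || n < 0) ? fallback : n;
}

// Only drive a page when actually running under PhantomJS.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  var system = require('system');
  var delayMs = parseDelay(system.args[2], DEFAULT_DELAY_MS);
  page.open(system.args[1], function (status) {
    if (status !== 'success') {
      console.log('** Error loading url.');
      phantom.exit(1);
      return;
    }
    // Give the page's own scripts time to finish before dumping the DOM.
    setTimeout(function () {
      console.log(page.content);
      phantom.exit();
    }, delayMs);
  });
}
```

Saved under a hypothetical name such as get-delayed.js, it would be run as:
$> phantomjs /path/to/get-delayed.js "http://www.google.com" 3000 > "google.html"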


Not with wget alone, since it does not include a JavaScript engine. However, you could use WebKit to process the page and capture the resulting output.

You could use something like this as a starting point for getting the content: http://situated.wordpress.com/2008/06/04/take-screenshots-of-a-website-from-the-command-line/
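
PhantomJS is itself built on WebKit, so a command-line screenshot in the same spirit could be sketched as follows. The viewport size and the `screenshotName` helper are assumptions made for illustration, and this is not the method the linked post describes.

```javascript
// Hypothetical helper: derives a .png file name from a URL.
function screenshotName(url) {
  return url
    .replace(/^[a-z]+:\/\//i, '')   // drop the scheme
    .replace(/[^a-z0-9]+/gi, '_')   // collapse other characters to "_"
    .replace(/^_+|_+$/g, '') + '.png';
}

// Only drive a page when actually running under PhantomJS.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  var system = require('system');
  var address = system.args[1];
  // Assumed viewport; pick whatever matches the target layout.
  page.viewportSize = { width: 1280, height: 800 };
  page.open(address, function (status) {
    if (status === 'success') {
      // Renders the page, after its scripts have run, to an image file.
      page.render(screenshotName(address));
    } else {
      console.log('** Error loading url.');
    }
    phantom.exit(status === 'success' ? 0 : 1);
  });
}
```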
