My ultimate goal is to build a web crawler capable of downloading all of the images on a webpage. My understanding from the reading I've done is that I need to embed a rendering/layout engine such as Gecko or Webkit.
Unfortunately, I'm running windows, so PyWebkit is out and short learning C++ for Gecko or Java to use Rhino, I'm not sure where to turn.
Is there a reliable rendering开发者_开发知识库 engine with python bindings that will work in windows (64-bit, Windows 7)? Is there an easy way to execute javascript within a python script on windows?
You don't need Webkit to do that. All you need it an engine to run Javascript code, so take a look at Gogole V8 or Mozilla SpiderMonkey.
If you're prefer Python to build your crawler, you may want to use PyV8 as it provides all necessary bindings.
精彩评论