
Any way to run Firefox with GreaseMonkey scripts without a GUI/X session

Developer https://www.devze.com 2022-12-17 11:43 Source: Internet
I need to build a small "monitoring" scraper for a 3rd party website (it's an external website that has stats about our visitors).

Unfortunately, this website is very hard to scrape through the normal "wget" mechanism, because it uses a ton of sophisticated JS, part of it generated by GWT. So my workaround was to create a GreaseMonkey script and then have this script call a PHP page that would log the scraped data. Then as soon as Firefox starts with this webpage-to-scrape, the script goes to work.

This works well, but now I am trying to make it more robust as far as monitoring tools go. I want it to run on the server using a cron job. As far as I understand such things, this requires a DISPLAY variable to be set and for an X session to exist (Firefox is refusing to run for me). Is there any nice way to allow it to run from the batchuser account as a cron job?


I've done something similar to get Selenium running headless on a server. I used Xvfb.

http://en.wikipedia.org/wiki/Xvfb

This article has some tips for using Xvfb with Firefox:

http://semicomplete.com/blog/geekery/xvfb-firefox.html
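
For what it's worth, here is a rough sketch of how the Xvfb approach could be wrapped in a script that cron can call. It is only an illustration under some assumptions: `Xvfb` and `firefox` are on the PATH of the batch user, display `:99` is free, the Firefox profile already has the GreaseMonkey script installed, and the function names, the two-second startup wait, and the 120-second run time are all made up for the example.

```python
import os
import subprocess
import time


def xvfb_command(display=":99", screen="1024x768x24"):
    """Build the command line for a virtual framebuffer X server."""
    return ["Xvfb", display, "-screen", "0", screen]


def firefox_env(display=":99", base_env=None):
    """Copy the environment and point DISPLAY at the virtual server."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = display
    return env


def run_scrape(url, display=":99", run_seconds=120):
    """Start Xvfb, run Firefox against it for a while, then clean up.

    run_seconds is an assumed upper bound on how long the
    GreaseMonkey script needs to post its data.
    """
    xvfb = subprocess.Popen(xvfb_command(display))
    try:
        time.sleep(2)  # crude wait for the X server to come up
        firefox = subprocess.Popen(["firefox", url], env=firefox_env(display))
        try:
            firefox.wait(timeout=run_seconds)
        except subprocess.TimeoutExpired:
            # Firefox never exits on its own here, so stop it explicitly.
            firefox.terminate()
            firefox.wait()
    finally:
        xvfb.terminate()
        xvfb.wait()
```

With a call such as `run_scrape("http://example.com/stats")` at the bottom of the file (the URL is a placeholder), the cron job only has to invoke the script. Nothing needs a DISPLAY from a real X session, because the script starts its own.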


The best way to do that is to build Firefox in headless mode: http://hg.mozilla.org/incubator/offscreen
