I have a cron job, scrape.sh, that looks like this:
#!/bin/bash
touch rage
cd /etc/myproject/scraper
scrapy crawl foosite --set FEED_URI=../feeds/foosite.xml --set FEED_FORMAT=xml
scrapy crawl barsite --set FEED_URI=../feeds/barsite.xml --set FEED_FORMAT=xml
When it executes, the file rage does get created, and judging from my syslog it does run as root, so permissions shouldn't be a problem.
May 6 17:35:01 server CRON[10233]: (root) CMD (/etc/myproject/scraper/scrape.sh)
May 6 17:40:01 server CRON[17804]: (root) CMD (/etc/myproject/scraper/scrape.sh)
When I run scrape.sh by hand, it executes as expected and puts the foosite.xml file in the ../feeds directory. The directory exists and is empty when the cron job starts. What can I do to solve this issue?
- If I had to guess, the problem is an environment issue (e.g. scrapy is not in the PATH).
- To debug, make sure your cron job is sending standard output and standard error to a log file and/or syslog.
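One way to test the environment guess without waiting for the next cron run is to look up scrapy under a stripped-down environment. This is only a rough sketch: the helper name run_like_cron is made up for illustration, and the PATH value /usr/bin:/bin is an assumption about cron's default, which may differ on your system.

```shell
#!/bin/bash
# Run a command string under a minimal environment, roughly like cron's.
# ASSUMPTION: PATH=/usr/bin:/bin approximates cron's default on this system.
run_like_cron() {
    env -i HOME="${HOME:-/}" SHELL=/bin/sh PATH=/usr/bin:/bin /bin/sh -c "$1"
}

# Check whether scrapy can be resolved with that minimal PATH.
if run_like_cron 'command -v scrapy' >/dev/null 2>&1; then
    echo "scrapy found on the cron-like PATH"
else
    echo "scrapy NOT found on the cron-like PATH"
fi
```

If the second message is printed while scrapy works fine in your interactive shell, the PATH difference is a likely culprit.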
Maybe the command scrapy is not found? Cron jobs typically get a different shell environment than interactive shells, so perhaps scrapy is missing in your PATH and you should use /some/full/path/to/scrapy.
If that doesn't help, try redirecting stdout and stderr to some files so you can see what the output is:
http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-3.html
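Putting both suggestions together, the script could be reworked along these lines. This is a sketch, not a drop-in replacement: the function name run_scrape and its three arguments (the full path to scrapy, the project directory, and a log file) are hypothetical, and you would need to find the real scrapy path yourself, for example with command -v scrapy in your interactive shell.

```shell
#!/bin/bash
# Run both crawls with an explicit scrapy path, appending all output
# (stdout and stderr) to a log file so failures under cron become visible.
# Arguments (all hypothetical): scrapy path, project dir, log file.
run_scrape() {
    local scrapy="$1"
    local project_dir="$2"
    local log="$3"

    # The subshell keeps the cd from affecting the caller.
    (
        echo "run started: $(date)"
        cd "$project_dir" || { echo "cannot cd to $project_dir"; exit 1; }

        for site in foosite barsite; do
            "$scrapy" crawl "$site" \
                --set FEED_URI="../feeds/$site.xml" --set FEED_FORMAT=xml \
                || echo "crawl of $site failed with status $?"
        done
    ) >>"$log" 2>&1
}
```

In scrape.sh you would then call something like run_scrape /usr/local/bin/scrapy /etc/myproject/scraper /var/log/scrape.log, where the scrapy path and the log location are placeholders to adjust. After the next cron run, the log should show either the normal Scrapy output or an error such as a missing command.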