Good day,
I have a simple MySQL database with 1 table and 3 fields
Table: LINKS
ID
URL
STATUS
The table has about 3 million links.
I would like to check all the URLs and store the returned status code in the STATUS field so that I can remove the dead links later. This would probably require a shell script, because it will need to run for a long time.
I think curl's header output may be the best way to get the status code, but I don't know how to put this all together. Any help with the above, or a suggestion for a better way to handle this, would be greatly appreciated. Thank you.
I would rather do this in batches of, say, a thousand links, and instead of doing it in bash I'd use PHP or Perl (or any other scripting language of your choice, e.g. Python).
PHP's fopen can open HTTP URLs (when allow_url_fopen is enabled), which does the job of curl without spawning a separate process for each link check. MySQL connectivity is almost native in both PHP and Perl, too.
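As a rough illustration of the batched approach suggested above, here is a minimal Python sketch. It is not a tested production script: the function names, the batch size of 1000, the 10-second timeout, and the use of 0 for "no response" are all assumptions, and it expects ID to be a positive integer key. `conn` can be any DB-API connection; MySQL drivers such as PyMySQL use `%s` as the parameter placeholder.

```python
import http.client
import urllib.request
import urllib.error

BATCH_SIZE = 1000  # assumed batch size, per the "say, a thousand" suggestion


def check_url(url, timeout=10):
    """Return the HTTP status code for url, or 0 if no response was received."""
    try:
        # HEAD asks for headers only; some servers reject it, so GET may be needed.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses are raised as exceptions
    except (OSError, ValueError, http.client.HTTPException):
        return 0  # DNS failure, refused connection, timeout, malformed URL


def update_statuses(conn, check=check_url, batch_size=BATCH_SIZE, placeholder="%s"):
    """Walk LINKS in ID order, one batch at a time, and store each status.

    Returns the number of rows processed.
    """
    select_sql = (
        f"SELECT ID, URL FROM LINKS WHERE ID > {placeholder} "
        f"ORDER BY ID LIMIT {int(batch_size)}"
    )
    update_sql = f"UPDATE LINKS SET STATUS = {placeholder} WHERE ID = {placeholder}"
    last_id = 0
    total = 0
    while True:
        cur = conn.cursor()
        cur.execute(select_sql, (last_id,))
        rows = cur.fetchall()
        if not rows:
            return total
        results = [(check(url), link_id) for link_id, url in rows]
        cur.executemany(update_sql, results)
        conn.commit()  # commit per batch so an interrupted run loses little work
        last_id = rows[-1][0]
        total += len(rows)
```

Paging by `ID > last_id` rather than by OFFSET keeps each batch query cheap even deep into 3 million rows. Note that `urlopen` follows redirects, so a 301 or 302 is reported as the status of the final page.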
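The following script can help you get the status. I haven't tested the SQL part, so treat the mysql calls as a sketch (they assume a database named mydb and credentials in ~/.my.cnf):
mysql -N -B -e 'SELECT ID, URL FROM LINKS' mydb |
while IFS=$'\t' read -r ID URL
do
    # Fetch the URL, discard the body, keep only the HTTP status code
    STATUS=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$URL")
    # Store the status code in the STATUS column
    mysql -e "UPDATE LINKS SET STATUS='$STATUS' WHERE ID=$ID" mydb < /dev/null
done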