I read that spammers m开发者_开发技巧ay be downloading a specific registration page on my site using curl. Is there any way to block that specific page from being CURLed, either through htaccess or other means?
I don't think this is possible to block curl, as curl has the ability to send user agents, cookies, etc. As far as I understand, it can completely emulate a normal user.
If you are worried about protecting a form, you can generate a random token which is submitted automatically when the form is submitted. That way, anyone who tries to make a script to automate registration will have to worry about scraping it first.
There is one weakness in CURL, which you can exploit, it can not run javascript like a browser. So you can take advantage of this fact, one first landing on the reg page, have your server side code check for a cookie, if it isnt there, send some javascript code to the browser, this code will set the cookie and do a redirect/reload ... after reload the server side again checks for the cookie, incase of browsers it will find it.. incase of curl the cookie generation and reload/redirect wont happen in the first place.
I hope i made some sense, bottom line .. utilize javascript to differentiate between curl and browser.
As Oren says, spammers can forge user-agents, so you can't just block the curl user-agent string. The typical solution here is some kind of CATPCHA. These are often jumbled images (though non-visual forms exist) sites (including StackOverflow) have you transcribe to prove you're human.
精彩评论