I'm working on a PHP application that cycles through a list of URLs and checks whether each link is valid. I do this by opening the URL with the PHP function file_get_contents and then searching the page source for a certain string to decide whether the link is good or bad. While testing the application, towards the end of the day, whenever I tried to check a URL on this website I would get this message:
failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden in...
The message is a bit longer and includes the location of my code, but this is the part that stood out to me. Based on what I have been Googling, I'm thinking that maybe the company's router/firewall thinks I'm trying to spam/attack them. I'm wondering if I might be on some permanent "blacklist" or something like that, and how I would find out. I wasn't trying to do anything bad. Actually, what I'm doing will help this company, since it should help generate sales. Total accident :-) I'm going to call the company later and ask them about it.
Many sites block access from user agents that fail to identify themselves. Introduce yourself properly and you're likely to get better service.
ini_set('user_agent', "CharlesUserAgent1.0"); // Anything usually should do as long as it's not blank
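The same idea can be scoped to a single request instead of a global ini setting. The following is only a minimal sketch: the helper names `parse_status_code` and `fetch_with_user_agent` and the 10 second timeout are assumptions made for illustration, not part of the original code. It sets the User-Agent through a stream context and uses `ignore_errors` so that a 403 comes back as a status code you can inspect rather than as a "failed to open stream" warning.

```php
<?php
// Hypothetical helper: pull the status code out of a status line such as
// "HTTP/1.1 403 Forbidden". Returns null for any other header line.
function parse_status_code(string $headerLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $headerLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: fetch a URL with an explicit User-Agent.
// Returns [status code or null, body or null].
function fetch_with_user_agent(string $url, string $userAgent = 'CharlesUserAgent1.0'): array
{
    $context = stream_context_create([
        'http' => [
            'user_agent'    => $userAgent,
            'ignore_errors' => true, // still return the body on 4xx/5xx responses
            'timeout'       => 10,   // assumed value, adjust as needed
        ],
    ]);

    $body = @file_get_contents($url, false, $context);

    // The HTTP wrapper fills $http_response_header in the calling scope.
    // After redirects it holds several status lines, so keep the last one.
    $status = null;
    foreach ($http_response_header ?? [] as $line) {
        $code = parse_status_code($line);
        if ($code !== null) {
            $status = $code;
        }
    }

    return [$status, $body === false ? null : $body];
}
```

With this, a call such as `[$status, $body] = fetch_with_user_agent($url);` lets the checker treat a 403 as "blocked" and only run the string search when the status is 200.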
EDIT: You may also want to check out cURL; it does a much better job of making HTTP requests than PHP's built-in URL fopen wrappers.
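A cURL version of the link check might look roughly like the sketch below. The function names `fetch_with_curl` and `is_link_good`, the timeout, and the choice to follow redirects are assumptions for illustration; only the `curl_*` calls and constants are standard parts of PHP's cURL extension, which must be enabled.

```php
<?php
// Hypothetical helper: a link counts as "good" only if the server answered
// 200 and the expected marker string appears in the page source.
function is_link_good(int $status, string $body, string $needle): bool
{
    return $status === 200 && strpos($body, $needle) !== false;
}

// Hypothetical helper: fetch a URL with cURL, sending an explicit User-Agent.
// Returns [HTTP status code (0 if no response), body or null on failure].
function fetch_with_curl(string $url, string $userAgent = 'CharlesUserAgent1.0'): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow redirects (assumed desirable here)
        CURLOPT_USERAGENT      => $userAgent, // identify the client to the server
        CURLOPT_TIMEOUT        => 10,         // assumed value, adjust as needed
    ]);

    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    return [$status, $body === false ? null : $body];
}
```

Usage would be along the lines of `[$status, $body] = fetch_with_curl($url);` followed by `is_link_good($status, $body ?? '', 'your marker string')`, which keeps a 403 distinguishable from a page that loaded but lacks the marker.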
- It could be that the website checks the User-Agent header and then blocks your request.
- Some URLs may include a query string, and file_get_contents might not send the request the way a normal browser would, so the page you end up requesting could be one that is actually forbidden :/
Browse the URLs manually and see if you get the same error.