I have over 20k links to check, all of them RapidShare and FileServe links. Right now I'm using file_get_contents and checking whether a FileServe page contains 'This file was either in breach of a copyright holder or deleted by the uploader.' to mark the link as 'deleted', and whether a RapidShare page contains 'File not found.':
if (strpos($var2, "This file was either in breach of a copyright holder or deleted by the uploader.") !== false) {...
if (strpos($var, "File not found.") !== false) {...
The thing is that file_get_contents is really slow, even after I added this:
$context = stream_context_create(array('http' => array('header'=>'Connection: close')));
$var = file_get_contents($url,false,$context);
Is there any alternative way to do this faster? The script has been running for over an hour and has only checked 5k links...
Have you tried multi-threading, or another language like C, to do the check?
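If you want to stay in PHP, the curl_multi functions give you a similar effect without threads: many requests are in flight at once, so the script is not waiting on one slow page at a time. Here is a rough sketch, not a tested solution. The function names is_deleted and check_batch, the host keys, the batch size and the timeout are all my own assumptions; only the two marker strings come from the question.

```php
<?php
// Marker strings are taken from the question; the host keys are assumptions.
const DELETED_MARKERS = [
    'fileserve.com'  => 'This file was either in breach of a copyright holder or deleted by the uploader.',
    'rapidshare.com' => 'File not found.',
];

// Hypothetical helper: true when the body contains the "deleted" marker
// for the URL's host. Uses !== false so a match at offset 0 still counts.
function is_deleted(string $url, string $body): bool
{
    foreach (DELETED_MARKERS as $host => $marker) {
        if (stripos($url, $host) !== false) {
            return strpos($body, $marker) !== false;
        }
    }
    return false; // unknown host: nothing to match against
}

// Hypothetical helper: fetches one batch of URLs concurrently and
// returns [url => 'deleted' | 'ok' | 'error'].
function check_batch(array $urls, int $timeout = 10): array
{
    $mh = curl_multi_init();
    $handles = [];
    foreach (array_unique($urls) as $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,     // keep the body instead of printing it
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_CONNECTTIMEOUT => $timeout,
            CURLOPT_TIMEOUT        => $timeout, // do not let one slow host stall the batch
        ]);
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running) {
            curl_multi_select($mh, 1.0); // wait for activity instead of busy-looping
        }
    } while ($running && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $url => $ch) {
        $body = (string) curl_multi_getcontent($ch);
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        if ($code === 0 || $body === '') {
            $results[$url] = 'error'; // no HTTP response at all, retry later
        } else {
            $results[$url] = is_deleted($url, $body) ? 'deleted' : 'ok';
        }
        curl_multi_remove_handle($mh, $ch);
    }
    return $results;
}

// Usage sketch ($links is assumed to hold the 20k URLs):
// foreach (array_chunk($links, 50) as $batch) {
//     foreach (check_batch($batch) as $url => $state) {
//         echo "$url\t$state\n";
//     }
// }
```

Processing the list in chunks keeps the number of open connections bounded. A chunk size of 50 is only a starting guess, and the hosts may throttle or block you if you push it too high.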
Both of these services provide a public API; read their docs.