Quickly check servers for an active web server (multithreaded)

开发者 https://www.devze.com 2023-03-11 13:00 Source: web
I want to check a huge number (thousands) of websites to see whether they are still running, because I want to get rid of unnecessary entries in my hosts file (Wiki page about hosts files). I want to do it in a two-stage process:

  1. Check if something is running on Port 80
  2. Check the HTTP response code (if it's not 200 I have to check the site)

I want to multithread because, with thousands of addresses to check, I can't wait for each timeout one after the other. This question is only about step one.

My problem is that about 1/4 of my connection attempts fail. If I retry the failed ones, about 3/4 of them succeed. Am I not closing the sockets correctly? Am I running into a limit on open sockets? By default I run 16 threads, but I have the same problem with 8 or 4. Is there something I'm missing?

I have simplified the code a little. Here is the code of the thread:

import java.util.ArrayList;

public class SocketThread extends Thread {

  int tn;
  int n;
  String[] s;
  private ArrayList<String> good;
  private ArrayList<String> bad;

  public SocketThread(int tn, int n, String[] s) {
    this.tn = tn;
    this.n = n;
    this.s = s;
    good = new ArrayList<String>();
    bad = new ArrayList<String>();
  }

  @Override
  public void run() {
    int answer;
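    // Each thread handles one contiguous chunk of s; the last thread also takes the remainder.
    // (The earlier bound "((tn + 1) * (s.length / n)) - 1" skipped the last host of every chunk.)
    int chunk = s.length / n;
    int end = (tn == n - 1) ? s.length : (tn + 1) * chunk;
    for (int i = tn * chunk; i < end; i++) {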
      answer = checkPort80(s[i]);
      if (answer == 1) {
        good.add(s[i]);
      } else {
        bad.add(s[i]);
      }
      System.out.println(s[i] + " | " + answer);
    }
  }
}

And here is the checkPort80 method:

// Needs: import java.net.InetSocketAddress; import java.net.Socket;
public static int checkPort80(String host) {
  Socket socket = null;
  int reachable = -1; // -1 = not reachable, 1 = reachable
  try {
    //One way of doing it
    //socket = new Socket(host, 80);
    //socket.close();

    //Another way I've tried
    socket = new Socket();
    InetSocketAddress ina = new InetSocketAddress(host, 80);
    socket.connect(ina, 30000); // 30 second connect timeout
    reachable = 1;
  } catch (Exception e) {
    // Unknown host, refused connection or timeout: reachable stays -1
  } finally {
    // Close the socket on every path; closing an unconnected socket is fine
    if (socket != null) {
      try {
        socket.close();
      } catch (Exception e) {
        // Nothing useful to do if close() fails
      }
    }
  }
  return reachable;
}
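
A minimal sketch of the same step-one check written with try-with-resources, so the socket is closed on every path without a finally block. The class name `PortCheck`, the method name `isPortOpen` and the boolean return type are assumptions made for this illustration, not part of the code above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortCheck {

  /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
  static boolean isPortOpen(String host, int port, int timeoutMillis) {
    // try-with-resources closes the socket on success and on failure
    try (Socket socket = new Socket()) {
      socket.connect(new InetSocketAddress(host, port), timeoutMillis);
      return true;
    } catch (IOException e) {
      // Unknown host, connection refused, or connect timeout
      return false;
    }
  }
}
```

A call such as `PortCheck.isPortOpen(host, 80, 30000)` would correspond to `checkPort80(host) == 1`.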

As for the threads: I build an ArrayList of threads, create them and .start() them, and right afterwards I .join() them, then get the "good" and "bad" lists and save them to files.
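
The start/join pattern described above could also be expressed with a fixed thread pool, which hands out hosts one at a time instead of pre-splitting the array. This is only a sketch under assumptions: the class `HostSorter`, the method `checkAll` and the `Predicate<String>` checker parameter are hypothetical names, and the checker stands in for something like `checkPort80(host) == 1`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

final class HostSorter {

  /** Runs checker on every host with a fixed pool; returns host -> result in input order. */
  static Map<String, Boolean> checkAll(List<String> hosts, int threads, Predicate<String> checker) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Callable<Boolean>> tasks = new ArrayList<>();
      for (String host : hosts) {
        tasks.add(() -> checker.test(host));
      }
      // invokeAll blocks until every task has finished, much like join() on each thread
      List<Future<Boolean>> futures = pool.invokeAll(tasks);
      Map<String, Boolean> results = new LinkedHashMap<>();
      for (int i = 0; i < hosts.size(); i++) {
        results.put(hosts.get(i), resultOf(futures.get(i)));
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // keep the interrupt flag for the caller
      throw new IllegalStateException("interrupted while checking hosts", e);
    } finally {
      pool.shutdown();
    }
  }

  private static boolean resultOf(Future<Boolean> future) throws InterruptedException {
    try {
      return future.get();
    } catch (ExecutionException e) {
      return false; // a checker that threw counts as "bad"
    }
  }
}
```

Because results are collected from the futures in the calling thread, no list is shared between worker threads. Duplicate host names would collapse into one map entry.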

Help is appreciated.

PS: I rename the Hosts-file first so that it doesn't affect the process, so this is not an issue.

Edit:

Thanks to Marcelo Hernández Rishr I discovered that HttpURLConnection seems to be the better solution. It works faster and I can also get the HTTP response code, which I was interested in anyway (I just thought it would be much slower than only checking port 80). After a while I still suddenly get errors. I guess this is because the DNS server thinks it is a DoS attack ^^ (but I should examine further whether the error lies somewhere else). FYI, I use OpenDNS, so maybe they just don't like me ^^. x4u suggested adding a sleep() to the threads, which seems to make things a little better, but I don't know whether it will help me raise the entries/second.
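
For illustration, a response-code check with HttpURLConnection might look roughly like the sketch below. The class `HttpCheck`, the method `responseCode`, the use of a HEAD request, and the choice not to follow redirects are all assumptions for this example, not the code actually used here; it also assumes the URL uses http or https.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class HttpCheck {

  /** Returns the HTTP status code for url, or -1 if the request could not be made. */
  static int responseCode(String url, int timeoutMillis) {
    HttpURLConnection conn = null;
    try {
      conn = (HttpURLConnection) new URL(url).openConnection();
      conn.setRequestMethod("HEAD");           // status line only, no body (assumption)
      conn.setConnectTimeout(timeoutMillis);   // limit time spent connecting
      conn.setReadTimeout(timeoutMillis);      // limit time spent waiting for the reply
      conn.setInstanceFollowRedirects(false);  // report 301/302 instead of following them
      return conn.getResponseCode();
    } catch (IOException e) {
      // Malformed URL, unknown host, refused connection, or timeout
      return -1;
    } finally {
      if (conn != null) {
        conn.disconnect();
      }
    }
  }
}
```

Some servers answer HEAD differently from GET, so a non-200 result from this sketch may still deserve a second look.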

Still, I can't get anywhere near the speed I wanted (10+ entries/second); even 6 entries per second doesn't seem to work. Here are a few scenarios I have tested so far (all without any sleep()):

Threads  Time until first round of errors  Entries processed until then  Entries/second
10       1 minute 17 seconds               ~770                          10
8        3 minutes 55 seconds              ~2000                         8.51
6        6 minutes 30 seconds              ~2270                         5.82
I will try to find a sweet spot with threads and sleep (or maybe simply pause everything for one minute if I get many errors). The problem is that there are hosts files with one million entries, which at one entry per second would take more than 11 days, which I guess everyone understands is not acceptable. Are there ways to switch DNS servers on the fly? Any other suggestions? Should I post the new questions as separate questions?
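
The arithmetic behind these figures can be sketched as follows; the class `Eta` and its method names are hypothetical and exist only for this illustration.

```java
import java.time.Duration;

final class Eta {

  /** Throughput in entries per second for a measured run. */
  static double rate(long entries, long seconds) {
    return (double) entries / seconds;
  }

  /** Estimated total run time for the given number of entries at a steady rate. */
  static Duration timeFor(long entries, double entriesPerSecond) {
    if (entriesPerSecond <= 0) {
      throw new IllegalArgumentException("rate must be positive");
    }
    return Duration.ofSeconds(Math.round(entries / entriesPerSecond));
  }
}
```

For example, `Eta.timeFor(1_000_000, 1.0)` is 1,000,000 seconds, which is 11 full days plus several hours, while at the hoped-for 10 entries/second the same file would take 100,000 seconds, or roughly 28 hours.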

Thanks for the help so far. I'll post new results in about a week.


I have 3 suggestions that may help you in your task.

  1. Maybe you can use the class HttpURLConnection.
  2. Use a maximum of 10 threads, because you are still limited by CPU, bandwidth, etc.
  3. The lists good and bad shouldn't be part of your thread class. Maybe they can be static members of the class where you have your main method, with static synchronized methods to add entries to both lists from any thread.
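
The third suggestion could look something like this minimal sketch. The class name `Results` and the snapshot methods are assumptions added for illustration; the idea is only that every access to the shared lists goes through a static synchronized method.

```java
import java.util.ArrayList;
import java.util.List;

final class Results {
  private static final List<String> good = new ArrayList<>();
  private static final List<String> bad = new ArrayList<>();

  // All methods lock on Results.class, so only one thread touches the lists at a time
  static synchronized void addGood(String host) {
    good.add(host);
  }

  static synchronized void addBad(String host) {
    bad.add(host);
  }

  /** Copy of the good list, safe to iterate while other threads keep adding. */
  static synchronized List<String> goodSnapshot() {
    return new ArrayList<>(good);
  }

  /** Copy of the bad list, safe to iterate while other threads keep adding. */
  static synchronized List<String> badSnapshot() {
    return new ArrayList<>(bad);
  }
}
```

Inside `run()`, each thread would then call `Results.addGood(s[i])` or `Results.addBad(s[i])` instead of adding to its own lists.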


Sockets usually try to shut down gracefully and wait for a response from the destination port. While they are waiting they still hold resources, which can make subsequent connection attempts fail if they are made while too many sockets are still open.

To avoid this you can turn off lingering before you connect the socket:

socket.setSoLinger(false, 0);
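
As a sketch of where that call would sit, the option can be set on an unconnected socket and then read back with getSoLinger(), which returns -1 when SO_LINGER is disabled. The class `LingerConfig` and the method name are hypothetical. On many platforms SO_LINGER is already disabled for a new socket, so it is worth checking whether the call changes anything in a given setup.

```java
import java.net.Socket;
import java.net.SocketException;

final class LingerConfig {

  /** Creates an unconnected socket with SO_LINGER explicitly disabled. */
  static Socket newSocketWithoutLinger() throws SocketException {
    Socket socket = new Socket();   // not connected yet
    socket.setSoLinger(false, 0);   // with "false" the linger time argument is ignored
    return socket;
  }
}
```

The returned socket would then be connected with `socket.connect(new InetSocketAddress(host, 80), 30000)` as in the question's code.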