I'm using HtmlAgilityPack in a parser that runs on a server, but I'm having issues with one of the websites I'm parsing: every day around 6am they tend to shut down their servers for maintenance, which throws off the Load() method of HtmlWeb and makes my app crash. Does anyone have a more robust way of loading a website into HtmlAgilityPack, or some way to do error checking in C# to prevent my app from crashing? (My C# is a little rusty.) Here is my code right now:
HtmlWeb webGet = new HtmlWeb();
HtmlDocument document = webGet.Load(dealsiteLink); //The Load() method here stalls the program because it takes 1 or 2 minutes before it realizes the website is down
Thank you!
Just surround the call with a try-catch:
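HtmlWeb webGet = new HtmlWeb();
HtmlDocument document = null; // stays null if the load fails
try
{
    document = webGet.Load(dealsiteLink);
}
catch (WebException ex)
{
    // The site could not be reached (for example, it is down for maintenance).
    // Logic to retry (maybe in 10 minutes) goes here
}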
The exact retry logic will depend on how your application is structured; you will probably find that the try-catch block needs to go much higher up in your application than this.
I think WebException is the exception you should catch, but I can't be sure because I can't find the documentation. You might find that you also need to catch TimeoutException.
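As a rough illustration of the retry idea, here is a minimal sketch of a generic retry wrapper. The `Retry` class, the `WithRetries` name, and the attempt count and delay are all hypothetical and not part of HtmlAgilityPack; the choice of exceptions simply follows the guess above (WebException, possibly TimeoutException), so adjust it to whatever your application actually sees.

```csharp
using System;
using System.Net;
using System.Threading;

// Hypothetical helper, not part of HtmlAgilityPack.
static class Retry
{
    // Runs `action` up to `maxAttempts` times, sleeping `delay` between
    // failed attempts. Returns null if every attempt fails.
    public static T WithRetries<T>(Func<T> action, int maxAttempts, TimeSpan delay)
        where T : class
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception ex) when (ex is WebException || ex is TimeoutException)
            {
                // Assumption: these are the exceptions raised when the site is down.
                if (attempt < maxAttempts)
                {
                    Thread.Sleep(delay);
                }
            }
        }
        return null; // caller decides what to do when the site never came back
    }
}
```

It could then be used along the lines of `HtmlDocument document = Retry.WithRetries(() => webGet.Load(dealsiteLink), 3, TimeSpan.FromMinutes(10));`, checking for null afterwards. Blocking a thread with Thread.Sleep is only reasonable for a simple background job; a scheduler or timer may fit better depending on how the app is structured.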
Try calling WebRequest.GetResponse on the website's home page and catch WebException. If you get a WebException, wait a while and try again until you get a response back; once you get a response, proceed with HtmlAgilityPack's Load method.
Check this
http://msdn.microsoft.com/en-us/library/system.net.webrequest.getresponse.aspx#Y700
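A minimal sketch of that pre-check might look like the following. The `SiteProbe` class and `IsReachable` name are hypothetical, and it assumes the URL is an http or https address. Setting a short `Timeout` on the request may also help with the one-to-two-minute stall mentioned in the question, since the probe gives up quickly instead of waiting for the default timeout.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of HtmlAgilityPack.
static class SiteProbe
{
    // Returns true if the URL answers with a successful response within
    // timeoutMs milliseconds, false if the request fails or times out.
    // Assumes `url` is an http:// or https:// address.
    public static bool IsReachable(string url, int timeoutMs)
    {
        try
        {
            var request = (HttpWebRequest)WebRequest.Create(url);
            request.Timeout = timeoutMs; // give up quickly instead of stalling
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode == HttpStatusCode.OK;
            }
        }
        catch (WebException)
        {
            // Thrown for timeouts, connection failures and HTTP error statuses.
            return false;
        }
    }
}
```

A caller could check `SiteProbe.IsReachable(dealsiteLink, 10000)` first and only call `webGet.Load(dealsiteLink)` when it returns true. The site can still go down between the probe and the load, so keeping the try-catch around Load() is still worthwhile.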