I'm working with a third-party vendor at the moment who has supplied an ASP.NET web application. The web app generates around 200 unhandled exceptions per day, which end up as emails in my inbox. Upon investigation it turns out that most of these errors are triggered by the GoogleBot web crawler indexing the site and triggering access to another third-party web service, which is rate-limiting the requests. When a request limit is exceeded, the third-party web service refuses the request, which results in an unhandled exception on the web server and an HTTP 500 status code. The exception looks like this:
Exception: Exception of type 'System.Web.HttpUnhandledException' was thrown.
Stack Trace:
   at System.Web.UI.Page.HandleError(Exception e)
   at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)
   at System.Web.UI.Page.ProcessRequest(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)
   at System.Web.UI.Page.ProcessRequest()
   at System.Web.UI.Page.ProcessRequest(HttpContext context)
   at ASP.views_products_detail_aspx.ProcessRequest(HttpContext context)
   at System.Web.Mvc.ViewPage.RenderView(ViewContext viewContext)
   at System.Web.Mvc.ViewResultBase.ExecuteResult(ControllerContext context)
   at System.Web.Mvc.ControllerActionInvoker.c__DisplayClass11.b__e()
   at System.Web.Mvc.ControllerActionInvoker.InvokeActionResultFilter(IResultFilter filter, ResultExecutingContext preContext, Func`1 continuation)
   at System.Web.Mvc.ControllerActionInvoker.InvokeActionResultWithFilters(ControllerContext controllerContext, IList`1 filters, ActionResult actionResult)
   at System.Web.Mvc.ControllerActionInvoker.InvokeAction(ControllerContext controllerContext, String actionName)
The web app developer seems unwilling to handle these errors, for reasons I don't really understand. Their approach is to throttle the GoogleBot until the errors stop happening (Google indexes quite aggressively, generating around 5,000 hits per day). While I accept that throttling the GoogleBot would work, it seems like a cop-out to me. I've always considered unhandled exceptions to be bugs. Shouldn't the web app handle these errors? Is it ever acceptable to allow an HTTP 500 to happen? What do the web developers out there think?
There are really several questions here:
- Should the web site display exceptions? No.
- Should the web site display something more friendly to users? Yes.
- Should the web site return a 500 error to Googlebot when it can't proceed? Probably.
- Should you ask Googlebot to slow down? Yes.
- Should you be receiving 500 exception emails a day with no throttling or summarization? Probably not.
More detail:
Using google.com/webmasters you can ask Google to index your site less aggressively.
You should never show an exception to users; always catch it and display a friendly error page. BUT you need to be careful to preserve the HTTP status code when you display that page (e.g. 404 or 500), because if you return the error page with code 200 it will find its way into search engine indexes.
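One way to preserve the status code is to set it explicitly in the friendly page's code-behind. A minimal sketch, assuming a custom error page named Error.aspx (the page name is hypothetical; `TrySkipIisCustomErrors` is the real HttpResponse property that stops IIS 7+ swapping in its own error page):

```csharp
// Error.aspx.cs — code-behind for a friendly error page.
// Returning 500 (not 200) keeps this page out of search engine indexes.
using System;

public partial class ErrorPage : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        Response.StatusCode = 500;               // preserve the original failure code
        Response.TrySkipIisCustomErrors = true;  // stop IIS replacing this response
    }
}
```

The same idea applies to a 404 page: friendly content for humans, honest status code for crawlers.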
Any error handler should rate-limit how often it sends email messages when errors occur.
A well-written error handler should also allow the suppression of errors you know will happen, e.g. certain search engines insist on requesting pages that do not exist.
If Google can get you into a rate-limit situation, then high traffic from real users may be able to do the same, so overall it sounds like you need some kind of caching solution here too.
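A caching layer in front of the rate-limited service might look like the following sketch. The service call and cache key are hypothetical assumptions; `HttpRuntime.Cache.Insert` is the real ASP.NET cache API:

```csharp
// Sketch: cache the third-party service's response so repeated page views
// (from crawlers or users) don't each consume the rate-limited quota.
using System;
using System.Web;
using System.Web.Caching;

public static class ProductServiceCache
{
    public static string GetProductDetail(string productId)
    {
        string key = "product-detail:" + productId;
        string cached = HttpRuntime.Cache[key] as string;
        if (cached != null)
            return cached;  // served from cache, no service call made

        string fresh = CallRateLimitedService(productId);
        HttpRuntime.Cache.Insert(key, fresh, null,
            DateTime.UtcNow.AddMinutes(10),   // absolute expiry
            Cache.NoSlidingExpiration);
        return fresh;
    }

    private static string CallRateLimitedService(string productId)
    {
        // ... the real call to the vendor's web service goes here ...
        return "";
    }
}
```

With a 10-minute cache, 5,000 crawler hits a day translate into far fewer service calls, which may keep you under the quota without touching Googlebot at all.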
No, it's not acceptable. The web app should, at the very least, catch exceptions in global.asax.cs and do something reasonable with them.
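A last-resort handler in Global.asax.cs can be as small as this sketch (the error page name is hypothetical; the page itself should set the 500 status code so clients and Googlebot still see a failure):

```csharp
// Global.asax.cs — minimal catch-all so unhandled exceptions don't
// reach users as a yellow screen of death.
using System;
using System.Web;

public class Global : HttpApplication
{
    protected void Application_Error(object sender, EventArgs e)
    {
        Exception ex = Server.GetLastError();
        // Log / notify here (ideally rate-limited, as discussed above).
        Server.ClearError();
        Server.Transfer("~/Error.aspx");  // friendly page; it sets StatusCode = 500
    }
}
```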
The web developer is unwilling to handle the exception in a web service?
That's rather frightening, as there are many, many reasons why a web service call might throw an exception. Instead of asking them to fix the GoogleBot issue, ask the developer to make the application degrade gracefully when the web service is unavailable.
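Graceful degradation here could be as simple as catching the refusal and falling back to the last good answer, or to a placeholder. A sketch with entirely hypothetical types and names:

```csharp
// Sketch of graceful degradation: if the rate-limited service refuses
// the call, serve stale data or a placeholder instead of letting the
// exception bubble up as an HTTP 500.
using System;
using System.Collections.Generic;

public class ProductDetail
{
    public static readonly ProductDetail Unavailable = new ProductDetail();
    // ... real fields omitted ...
}

public interface IDetailService
{
    ProductDetail GetDetail(string productId);
}

public class SafeDetailProvider
{
    private readonly IDetailService service;
    private readonly Dictionary<string, ProductDetail> staleCache =
        new Dictionary<string, ProductDetail>();

    public SafeDetailProvider(IDetailService service)
    {
        this.service = service;
    }

    public ProductDetail GetDetailSafely(string productId)
    {
        try
        {
            ProductDetail fresh = service.GetDetail(productId);
            staleCache[productId] = fresh;        // remember the last good answer
            return fresh;
        }
        catch (Exception)                         // e.g. the rate-limit refusal
        {
            ProductDetail stale;
            if (staleCache.TryGetValue(productId, out stale))
                return stale;                     // degrade to slightly stale data
            return ProductDetail.Unavailable;     // render a "try again later" section
        }
    }
}
```

The page still renders, the user sees slightly stale or partial content, and no unhandled exception is thrown.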
IMO this is actually rather acceptable. The problem is that you have reached your service saturation point as things stand. There are two ways to handle this: spend the money and time required to raise the saturation point and be allocated more service, or spend the time and money to remove your dependence on that service.
Edit: There is a third option, which would be to follow in the footsteps of the NY Times, treat Google as a thief of your services, and just ban them. Of course this is really just putting your head in the sand, but it's an option.
The web app is handling the errors by e-mailing you.
What I am wondering is: what if a real human site visitor experienced this error, how would you want it handled? I imagine you would want to be e-mailed if a real human visitor experienced this error. Maybe some of these errors are from real human visitors, because Google used up the quota?
It seems you need a more graceful way overall of making sure the site scales without the third-party service. It's not clear to me which vendor (the ASP.NET developer or the third party service) should work on it but that's more of a project management issue.
A situation outside the control of the vendor has occurred. The vendor has logged the error and sent a notification to you. The vendor cannot control the limiting on the web service or the number of visitors to your site. You are in control of both.
Unless the specification stated that the emails should be throttled in some way, the vendor's position is sound. If you want work done, you pay for it.
So you have three choices:
- Throttle google in some way;
- Extend your web service allowance; or
- Pay for the website to be altered.