ASP.NET counting visitors, not bots

Developer https://www.devze.com 2023-01-11 06:35 Source: Internet
I have an ASP.NET 4 web site. I'm counting visitors in the background, but my code counts search engine bots too. How can I tell whether a client is a bot or a human? I don't want to count bots.

Regards


You can use the Crawler property of Request.Browser to filter search engine bots.
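A minimal sketch of that check, assuming the counter lives in a hypothetical static class called VisitorCounter that is called once per visit (for example from Session_Start in Global.asax). Request.Browser.Crawler is the real ASP.NET property; the class and member names are made up for illustration. The property relies on ASP.NET's browser definition files, so a crawler they do not recognize would still be counted.

```csharp
using System.Threading;
using System.Web;

// Hypothetical helper; class and member names are illustrative only.
public static class VisitorCounter
{
    private static int _humanVisits;

    // Counts the visit unless ASP.NET's browser definitions
    // flag the client as a search engine crawler.
    public static void RecordVisit(HttpRequest request)
    {
        if (request.Browser != null && request.Browser.Crawler)
        {
            return; // recognized crawler: do not count
        }
        Interlocked.Increment(ref _humanVisits);
    }

    public static int HumanVisits
    {
        // CompareExchange with equal values is a thread-safe read.
        get { return Interlocked.CompareExchange(ref _humanVisits, 0, 0); }
    }
}
```

It could be called as VisitorCounter.RecordVisit(HttpContext.Current.Request).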


You could check the User-Agent string against a list of known agents and look for entries of type R, which marks a robot or crawler.

See http://www.user-agents.org for more info.

I am sure some bots don't follow the standards, and you may have to handle those case by case.


Your best bet is probably checking the client's user agent:

http://support.microsoft.com/kb/306576

There may even be a quick little library out there for .NET with a lot of well known user agents or good regexps to use. Note that some bots will send fake user agents to make it look like they're people, some people's browsers may send empty or unknown user agents, etc. But those cases should be few and far between. For the most part this should get you pretty good statistics.
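As a rough illustration of the user agent approach, here is a small sketch that looks for a few substrings that commonly appear in crawler user agents. The class name, method name, and token list are assumptions made for this example, not an established library, and the list is far from complete. An empty user agent is treated as "not a bot" here, which is a judgment call since some real browsers send none.

```csharp
using System;

// Hypothetical helper; the token list is a small illustrative sample.
public static class UserAgentHeuristics
{
    private static readonly string[] BotTokens =
    {
        "bot", "crawl", "spider", "slurp"
    };

    // Returns true when the user agent contains a known crawler token.
    public static bool LooksLikeBot(string userAgent)
    {
        if (string.IsNullOrEmpty(userAgent))
        {
            return false; // unknown client: counted as human in this sketch
        }
        foreach (string token in BotTokens)
        {
            if (userAgent.IndexOf(token, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                return true;
            }
        }
        return false;
    }
}
```

In a page or handler this could be called as UserAgentHeuristics.LooksLikeBot(Request.UserAgent) before incrementing the counter. Bots that send a fake browser user agent will pass straight through.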


You can try and inspect the User Agent in the message header, for starters. A malicious bot will fake that, though. A more labor intensive approach is to log/inspect your IP visits programmatically (look in the web log files, or collect them yourself) and try to deduce which of them are bots based on frequency of visits, etc. Quite a cat and mouse game.
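The frequency idea could be sketched roughly like this: keep recent request times per IP address and flag an address once it exceeds some number of hits inside a sliding window. The class name RequestRateTracker and the window and threshold values are hypothetical. Sensible numbers depend on your traffic, and shared IPs (proxies, offices) can produce false positives.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sliding-window counter keyed by client IP address.
public class RequestRateTracker
{
    private readonly Dictionary<string, Queue<DateTime>> _hits =
        new Dictionary<string, Queue<DateTime>>();
    private readonly TimeSpan _window;
    private readonly int _maxHits;
    private readonly object _lock = new object();

    public RequestRateTracker(TimeSpan window, int maxHits)
    {
        _window = window;
        _maxHits = maxHits;
    }

    // Records a hit and returns true when the IP has made more than
    // maxHits requests within the window ending at 'now'.
    public bool RecordAndCheck(string ip, DateTime now)
    {
        lock (_lock)
        {
            Queue<DateTime> times;
            if (!_hits.TryGetValue(ip, out times))
            {
                times = new Queue<DateTime>();
                _hits[ip] = times;
            }
            // Drop hits that fell out of the window.
            while (times.Count > 0 && now - times.Peek() > _window)
            {
                times.Dequeue();
            }
            times.Enqueue(now);
            return times.Count > _maxHits;
        }
    }
}
```

For example, new RequestRateTracker(TimeSpan.FromMinutes(1), 60) would flag an address after its 61st request within one minute. In ASP.NET the address could come from Request.UserHostAddress.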


If you want to block crawlers from accessing certain links, create a robots.txt file in your site's root directory, with something like the following. Only well-behaved crawlers honor this file.

User-agent: *
# Disallow: /   (careful: this would block the entire site, not just the default page)
Disallow: /MyPage.aspx

See:

http://en.wikipedia.org/wiki/Robots_exclusion_standard

http://www.google.com/#hl=en&q=robots.txt
