I have a website that requires JavaScript to be turned on in order to work.
There is a <noscript> tag that contains a meta refresh to redirect the user to a page that alerts him about the disabled JavaScript...
I am wondering, is this a bad thing for search engine crawlers?
I send myself an e-mail whenever someone doesn't have JS, so I can analyze whether it's necessary to rebuild the website for these people, but usage is 100% JS-enabled and the only visitors without JS are search engine crawlers... I guess Google, Yahoo, etc. don't take the meta refresh seriously when it's inside a <noscript>? Should I do something to detect that they are bots and not redirect them with the meta?
Thanks,
Joe

Instead of forcefully redirecting the user/bot, why not just make text appear at the top of the page telling the user to enable JavaScript in order to use the site?
This lets the bots still read the page and follow the non-JavaScript links. It would end the problems with being redirected, and there would be no need to serve bots a different page, which would force you to keep multiple pages updated.
You may also want to take a look at Google Webmaster Tools to see what Google is currently reading, and improve based on that.
Example: disabling JavaScript on SO shows a red banner at the top that just states "Stack Overflow works best with JavaScript enabled". You could make that link to a page with more info if you feel it's not enough.
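As a rough sketch of that kind of banner (the class name and the /enable-javascript.html page are made up for illustration):

<noscript>
  <div class="js-warning">
    This site works best with JavaScript enabled.
    <a href="/enable-javascript.html">Here is how to turn it on.</a>
  </div>
</noscript>

Since the message lives inside <noscript>, crawlers and non-JS visitors see it while everyone else never does, and the rest of the page stays crawlable.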
Have you tried <!--googleoff: all--> <noscript><meta redirect... /></noscript><!--googleon: all-->
? It's not a complete solution, but it's worth a shot...
Here is what I would do:
- Make it so that the site somewhat works without JavaScript. If you use Ajax all over the place, make sure the links have their href set to the URL you will Ajax in (see the sketch after this list). This might get your site to "somewhat" work without JavaScript.
- Add some .htaccess redirects for the bots. Redirect them to some sane place where they can follow links and index content.
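A minimal sketch of the first point (the link URL, the ajax-link class and the #content container are all hypothetical): keep a real href on every link and intercept the click with JavaScript, so crawlers and non-JS visitors still get a working destination.

<div id="content"></div>
<a href="/products.html" class="ajax-link">Products</a>

<script>
  // Intercept clicks on enhanced links and load the target via Ajax;
  // without JavaScript, the plain href still navigates normally.
  document.addEventListener('click', function (e) {
    var link = e.target.closest('a.ajax-link');
    if (!link) return;
    e.preventDefault();
    fetch(link.href)
      .then(function (res) { return res.text(); })
      .then(function (html) {
        document.getElementById('content').innerHTML = html;
      });
  });
</script>

Crawlers simply follow /products.html directly and index whatever that URL returns.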
Your site as it stands is probably very bad in terms of crawlability and SEO.
Edit: OK, I see your problem. The crawlers get redirected away after seeing the stuff inside noscript.
How about this solution, then:
If you have just one page that has the noscript, you can add some rewrite rules to your Apache config that show a different version of the page to the bots, one without the noscript tag. For example:
RewriteEngine On
# Requests from known crawlers...
RewriteCond %{HTTP_USER_AGENT} Googlebot [OR]
RewriteCond %{HTTP_USER_AGENT} msnbot [OR]
RewriteCond %{HTTP_USER_AGENT} Slurp
# ...get served the copy of the page without the <noscript> meta refresh
RewriteRule ^.*$ nometa.html [L]
Also, what technologies are you using? Are you using any server-side languages? Are you even using Apache? I assumed you have Apache + HTML but no server-side language. If you do have something running server side, then this is easier.
Since <meta> isn't allowed in the <body> of a page, and <noscript> isn't legal in the <head> section, perhaps the bots are just giving up on a page where they hit bad HTML.
I suggest you simply use a <noscript> tag to encapsulate a warning message and a link that the user can click on if they do not have JavaScript switched on.
Search engines can be prevented from following this link using the /robots.txt file, or by placing a
<meta name="ROBOTS" content="NOINDEX,NOFOLLOW" />
tag on the page which is linked to.
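A minimal sketch of that approach (the /no-javascript.html page name is made up):

<noscript>
  <p>
    This site requires JavaScript to work.
    <a href="/no-javascript.html">Read why and what you can do about it.</a>
  </p>
</noscript>

Then put the <meta name="ROBOTS" content="NOINDEX,NOFOLLOW" /> tag shown above in the <head> of /no-javascript.html (or disallow it in /robots.txt) so search engines neither index nor follow it.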
You could have a page that says "You need JavaScript" on it, and then add to that page:
<script>
  // Visitors with JavaScript are sent straight on to the real page;
  // crawlers and users without JavaScript simply stay here.
  window.location.href = '/thejspage.html';
</script>
That way, people with JavaScript support will easily be sent to the valid page, and the spiders will just stay on that page instead of saving a page whose content requires JavaScript.
This should also help your SEO (as the search engines will find a page that regular users can see).
Maybe you could make use of a headless browser and serve an HTML snapshot of the page to those who don't have JavaScript enabled, including the crawlers.
http://code.google.com/web/ajaxcrawling/docs/getting-started.html
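In outline (a sketch of the scheme described at that link; double-check the details there), you opt the page into AJAX crawling with a meta tag in its <head>:

<meta name="fragment" content="!">

Googlebot then re-requests the URL with an ?_escaped_fragment_= parameter, and your server can answer that request with the HTML snapshot produced by the headless browser, while normal visitors keep getting the JavaScript version.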