I've been asked to work out how to maximize the visibility of an upcoming web application that will initially be available in two languages, French and English.
I am interested in understanding how robots, like Googlebot, crawl a site that is available in multiple languages.
I have a few questions concerning the behaviour of robots and indexing engines:
- Should a web site specify the language in the URL?
- Will a robot crawl a site in both languages if the language is set through cookies (supposing there is a link that changes the language)?
- Should I use a distinct domain for each language?
- What meta tag could be used to help a robot in understanding the language of a web site?
- Am I missing anything that I should be aware of?
- Yes
- No
- Not necessarily; Google will infer the language. If you use a different TLD per language you will probably get better exposure in specific countries, but at the cost of PageRank being diluted across the different domains.
- <meta http-equiv="content-language" content="en">
- a. You should add a link on every page to the same page in the other languages of the site.
b. For SEO, it's better to use www.mysite.com/en/ than en.mysite.com, because PageRank is not diluted across different domains.
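To make points (a) and (b) concrete, here is a minimal sketch of an English page served from www.mysite.com/en/ (the URLs are placeholders; the hreflang alternate links are the mechanism Google documents for declaring translations, and the visible link is what point (a) asks for):

    <html lang="en">
    <head>
      <meta http-equiv="content-language" content="en">
      <!-- tell crawlers where the translations of this page live -->
      <link rel="alternate" hreflang="en" href="http://www.mysite.com/en/">
      <link rel="alternate" hreflang="fr" href="http://www.mysite.com/fr/">
    </head>
    <body>
      <!-- the visible, crawlable link from point (a) -->
      <a href="http://www.mysite.com/fr/">Version française</a>
    </body>
    </html>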
Should a web site specify the language in the URL?
No, not necessarily.
Will a robot crawl a site in both languages if the language is set through cookies (supposing there is a link that changes the language)?
No; a robot makes its requests without your cookies, so it will only ever crawl the default language. You should use a content-language attribute as suggested by Eduardo. Alternatively, <html lang='en'> will do the same job, AFAIK.
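As a sketch of the crawler-friendly alternative, give each language its own URL and switch languages with plain links rather than a cookie-setting endpoint (both paths below are hypothetical):

    <!-- each language lives at its own URL, so Googlebot can index both -->
    <a href="/en/products.html">English</a>
    <a href="/fr/produits.html">Français</a>

    <!-- avoid this: a crawler never keeps the cookie this sets, so it
         will only ever index the default language -->
    <a href="/set-language?lang=fr">Français</a>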
What meta tag could be used to help a robot in understanding the language of a web site?
See above
Should I use a distinct domain for each language?
The Stack Overflow consensus (I'm sorry, I can't for the life of me find the relevant questions! We had huge discussions on this; maybe they were closed as not programming related) is: yes, have a different domain for each country if you want to maximize search engine visibility in that country.
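If you do split the site across country domains, you can still tell Google that the pages are translations of each other using the same alternate links as above; a sketch assuming hypothetical domains www.mysite.fr and www.mysite.co.uk:

    <!-- in the <head> of http://www.mysite.fr/page.html -->
    <link rel="alternate" hreflang="en-GB" href="http://www.mysite.co.uk/page.html">

    <!-- in the <head> of http://www.mysite.co.uk/page.html -->
    <link rel="alternate" hreflang="fr-FR" href="http://www.mysite.fr/page.html">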