I'm running a site which allows users to create subdomains. I'd like to submit these user subdomains to search engines via sitemaps. However, according to the sitemaps protocol (and Google Webmaster Tools), a single sitemap can include URLs from a single host only.
What is the best approach?
At the moment I have the following structure:
- Sitemap index located at example.com/sitemap-index.xml that lists sitemaps for each subdomain (but located at the same host).
- Each subdomain has its own sitemap located at example.com/sitemap-subdomain.xml (this way the sitemap index includes URLs from a single host only).
- A sitemap for a subdomain contains URLs from the subdomain only, i.e., subdomain.example.com/*
- Each subdomain has subdomain.example.com/robots.txt file:
--
User-agent: *
Allow: /
Sitemap: http://example.com/sitemap-subdomain.xml
--
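For illustration, the sitemap index described above could be generated roughly like this. This is only a sketch: the function name `build_sitemap_index` and the subdomain names in the usage note are hypothetical, and the `sitemap-<subdomain>.xml` naming simply follows the structure listed above.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(subdomains, host="example.com"):
    """Return a sitemap index (as a string) with one entry per subdomain.

    Every listed sitemap lives on `host`, so the index itself only
    references URLs from a single host.
    """
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name in subdomains:
        sitemap = ET.SubElement(root, "sitemap")
        loc = ET.SubElement(sitemap, "loc")
        loc.text = f"http://{host}/sitemap-{name}.xml"
    return ET.tostring(root, encoding="unicode")
```

Calling `build_sitemap_index(["alice", "bob"])` would list `http://example.com/sitemap-alice.xml` and `http://example.com/sitemap-bob.xml`.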
I think this approach complies with the sitemaps protocol. However, Google Webmaster Tools gives errors for subdomain sitemaps: "URL not allowed. This url is not allowed for a Sitemap at this location."
I've also checked how other sites do it. Eventbrite, for instance, produces sitemaps that contain URLs from multiple subdomains (e.g., see http://www.eventbrite.com/events01.xml.gz). This, however, does not comply with the sitemaps protocol.
What approach do you recommend for sitemaps?
I recently struggled through this and finally got it working. See this thread for more details:
http://www.google.com/support/forum/p/Webmasters/thread?tid=53c3e4b3ab8d9503&hl=en&fid=53c3e4b3ab8d9503000497bd04ba63cf
Summary:
- Use DNS verification to verify your site and all its subdomains in one fell swoop.
- Make the robots.txt on all your subdomains point to the main sitemap on your www domain.
- You may need to wait several days for Google to update its cached copies of robots.txt on all your subdomains. It will still show errors until then.
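As a rough sketch of the second point, every subdomain could serve the same robots.txt body, generated by something like the following. The function name `robots_txt` and the sitemap URL `http://www.example.com/sitemap.xml` are assumptions for illustration; substitute the real location of your main sitemap.

```python
def robots_txt(sitemap_url="http://www.example.com/sitemap.xml"):
    """Return a robots.txt body that points crawlers at the main sitemap.

    The same text can be served from every subdomain, so all of them
    reference the one sitemap hosted on the www domain.
    """
    lines = [
        "User-agent: *",
        "Allow: /",
        f"Sitemap: {sitemap_url}",
    ]
    return "\n".join(lines) + "\n"
```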
Yes, the subdomain restriction is in the sitemaps.org spec, but Google has put some exceptions in place:
Verify all subdomains within your Google Webmaster Tools account (http://www.google.com/support/webmasters/bin/answer.py?answer=75712). Cross-submission of XML sitemaps via Google Webmaster Tools, if submitted via the root of your domain, will then not throw errors for Google.
Within the robots.txt of a subdomain you can point to XML sitemaps on other domains. There will be no cross-submission errors for Google.
If you have a website that allows users to create subdomains within your site, you can simplify the process by creating and submitting a single sitemap instead of one for each subdomain. This sitemap includes the URLs for all your subdomain sites and is saved to a single location. To do this, however, all sites must be verified in Webmaster Tools. You can define one sitemap as:
http://example.com/sitemap.xml
Define the sitemap entries for all your subdomain URLs under this document tree.
You can define multiple sitemap files, each with up to 50,000 URLs and a file size of up to 10 megabytes. Sitemaps can be compressed using gzip to reduce bandwidth. So defining the sitemap this way should not cause any problems.
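A minimal sketch of the splitting and compression step, assuming the 50,000-URL limit quoted above. The names `chunk`, `render_sitemap` and `build_gzipped_sitemaps` are made up for this example, and the per-file size limit is not checked here.

```python
import gzip
from xml.sax.saxutils import escape

MAX_URLS = 50_000  # per-sitemap URL limit quoted above


def chunk(urls, size=MAX_URLS):
    """Yield consecutive slices of `urls`, each with at most `size` items."""
    for i in range(0, len(urls), size):
        yield urls[i:i + size]


def render_sitemap(urls):
    """Render one <urlset> document for the given URLs."""
    entries = "".join(f"<url><loc>{escape(u)}</loc></url>" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        f"{entries}</urlset>"
    )


def build_gzipped_sitemaps(urls):
    """Return a list of gzip-compressed sitemap documents (bytes)."""
    return [
        gzip.compress(render_sitemap(part).encode("utf-8"))
        for part in chunk(urls)
    ]
```

Each element of the returned list can be written to its own `.xml.gz` file and then listed in a sitemap index.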