
Django sitemap intermittent www

开发者 https://www.devze.com 2023-01-01 03:49 Source: Internet

The automatic sitemap for my Django site fluctuates between including the www on URLs and leaving it out (I'm aiming to have it in all the time). As a result, Google is not indexing my pages properly, so I'm trying to narrow down what is causing this.

I have set PREPEND_WWW = True, and my site record in the sites framework includes the www, i.e. it's set to www.example.com as opposed to example.com. I'm using memcached, but pages should expire from the cache after 48 hours, so I wouldn't have thought that would be causing the issue.

You can see the problem in effect at http://www.livingspaceltd.co.uk/sitemap.xml (refresh the page a few times).
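Refreshing by hand works, but the fluctuation is easier to quantify with a small script. The following is a minimal sketch using only the Python standard library; the helper names tally_hosts and sample_sitemap are made up for illustration, and it assumes the sitemap uses the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlsplit
from urllib.request import urlopen

# Namespace used by standard sitemap.xml files (assumed here).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def tally_hosts(sitemap_xml):
    """Count how often each hostname appears in the <loc> entries."""
    root = ET.fromstring(sitemap_xml)
    hosts = Counter()
    for loc in root.iter(SITEMAP_NS + "loc"):
        if loc.text:
            hosts[urlsplit(loc.text.strip()).hostname] += 1
    return hosts


def sample_sitemap(url, attempts=10):
    """Fetch the sitemap several times and merge the per-fetch tallies."""
    totals = Counter()
    for _ in range(attempts):
        with urlopen(url) as response:
            totals += tally_hosts(response.read())
    return totals
```

Calling sample_sitemap() with the sitemap URL returns a Counter keyed by hostname. If both www.example.com and example.com show up with non-zero counts, the responses really are inconsistent between requests.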

My sitemaps setup is fairly prosaic so I'm doubtful that that is the issue, but in case it's something obvious I'm missing here's the code:

***urls.py***

sitemaps = {
    'subpages': Subpages_Sitemap,
    'standalone_pages': Standalone_Sitemap,
    'categories': Categories_Sitemap,
}

urlpatterns = patterns('',
    (r'^sitemap\.xml$', 'django.contrib.sitemaps.views.sitemap', {'sitemaps': sitemaps}),
    ...

***sitemaps.py***

# -*- coding: utf-8 -*- 
from django_ls.livingspace.models import Page, Category, Standalone_Page, Subpage
from django.contrib.sitemaps import Sitemap

class Subpages_Sitemap(Sitemap):
    changefreq = "monthly"
    priority = 0.4
    def items(self):
        return Subpage.objects.filter(restricted_to__isnull=True)

class Standalone_Sitemap(Sitemap):
    changefreq = "weekly"
    priority = 1
    def items(self):
        return Standalone_Page.objects.all()

class Categories_Sitemap(Sitemap):
    changefreq = "weekly"
    priority = 0.7
    def items(self):
        return Category.objects.all()


PREPEND_WWW = True in settings.py must appear above your caching settings. This fixed my problem, which was the same as yours. I ran into it when I submitted my sitemap in Google Webmaster Tools.


This might be one of the weirdest problems I've seen, because the way Django constructs URLs in a sitemap is extremely straightforward. It just gets the current Site object from the database and prepends the value of its "domain" field to the page's relative location:

current_site = Site.objects.get_current()
...
loc = "http://%s%s" % (current_site.domain, self.__get('location', item))

(source)

Are you sure you are not doing anything weird at the database level? If you had multiple mirrored databases that weren't consistent, it could produce a similar effect. Try setting up a test view that just displays Site.objects.get_current(). It will probably fluctuate as well.
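A rough sketch of what such a test view could report is below. The helper format_site_debug and the view name site_debug are hypothetical. The process ID is included on the assumption that the site runs under several worker processes, which the question doesn't state. Django's sites framework keeps the current Site object in a per-process in-memory cache, so workers that were not restarted after the domain was changed may still hold the old value. The formatting helper is plain Python so it can be tried without Django; the Django wiring is shown only as a comment.

```python
import os
from types import SimpleNamespace


def format_site_debug(site, request_host):
    """Describe which Site record this worker process currently sees."""
    return "pid=%s site_id=%s site_domain=%s request_host=%s" % (
        os.getpid(), site.id, site.domain, request_host,
    )


# Hypothetical wiring inside a Django project (not executed here):
#
#   from django.contrib.sites.models import Site
#   from django.http import HttpResponse
#
#   def site_debug(request):
#       site = Site.objects.get_current()
#       return HttpResponse(format_site_debug(site, request.get_host()))

# Stand-in object so the helper can be tried without Django installed.
fake_site = SimpleNamespace(id=1, domain="www.example.com")
print(format_site_debug(fake_site, "example.com"))
```

If repeated requests to the view show different site_domain values for different pid values, the processes disagree about the Site record, and restarting every worker (or calling Site.objects.clear_cache() in each) would be the next thing to try.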

If you use any third-party caching app (like Johnny Cache) try turning it off.

Also, make sure you don't have two Site objects, one with and one without the www (on its own that shouldn't produce this effect, but with multiple server instances configured for different SITE_ID values... maybe?)


Well, it looks like it was a caching error after all. I'm not quite sure what was wrong: I had made the changes over a week ago, so the cache definitely wasn't behaving right, and I had to try a couple of different methods to restart it. That bears some deeper investigation, but it's working now.
