Perhaps this is completely normal behaviour, but I feel like the django_session
table is much larger than it should have to be.
First of all, I run the following cleanup command daily so the size is not caused by expired sessions:
DELETE FROM %s WHERE expire_date < NOW()
The numbers:
- We've got about 5000 unique visitors (bots excluded) every day.
- The
SESSION_COOKIE_AGE
is set to the default, 2 weeks - The table has a little over 1,000,000 rows
So, I'm guessing that Django also generates session keys for all bots that visits the site and that the bots don't store the cookies so it continuously generates new cookies.
But... is this normal behaviour? Is there a setting so Django won't generate sessions for anonymous users, o开发者_开发问答r atleast... no sessions for users that aren't using sessions?
After a bit of debugging I've managed to trace cause of the problem.
One of my middlewares (and most of my views) have a request.user.is_authenticated()
in them.
The django.contrib.auth
middleware sets request.user
to LazyUser()
Source: http://code.djangoproject.com/browser/django/trunk/django/contrib/auth/middleware.py?rev=14919#L13 (I don't see why there is a return None
there, but ok...)
class AuthenticationMiddleware(object):
def process_request(self, request):
assert hasattr(request, 'session'), "The Django authentication middleware requires session middleware to be installed. Edit your MIDDLEWARE_CLASSES setting to insert 'django.contrib.sessions.middleware.SessionMiddleware'."
request.__class__.user = LazyUser()
return None
The LazyUser
calls get_user(request)
to get the user:
Source: http://code.djangoproject.com/browser/django/trunk/django/contrib/auth/middleware.py?rev=14919#L5
class LazyUser(object):
def __get__(self, request, obj_type=None):
if not hasattr(request, '_cached_user'):
from django.contrib.auth import get_user
request._cached_user = get_user(request)
return request._cached_user
The get_user(request)
method does a user_id = request.session[SESSION_KEY]
Source: http://code.djangoproject.com/browser/django/trunk/django/contrib/auth/init.py?rev=14919#L100
def get_user(request):
from django.contrib.auth.models import AnonymousUser
try:
user_id = request.session[SESSION_KEY]
backend_path = request.session[BACKEND_SESSION_KEY]
backend = load_backend(backend_path)
user = backend.get_user(user_id) or AnonymousUser()
except KeyError:
user = AnonymousUser()
return user
Upon accessing the session sets accessed
to true:
Source: http://code.djangoproject.com/browser/django/trunk/django/contrib/sessions/backends/base.py?rev=14919#L183
def _get_session(self, no_load=False):
"""
Lazily loads session from storage (unless "no_load" is True, when only
an empty dict is stored) and stores it in the current instance.
"""
self.accessed = True
try:
return self._session_cache
except AttributeError:
if self._session_key is None or no_load:
self._session_cache = {}
else:
self._session_cache = self.load()
return self._session_cache
And that causes the session to initialize. The bug was caused by a faulty session backend that also generates a session when accessed
is set to true...
Is it possible for robots to access any page where you set anything in a user session (even for anonymous users), or any page where you use session.set_test_cookie()
(for example Django's default login view in calls this method)? In both of these cases a new session object is created. Excluding such URLs in robots.txt should help.
For my case, I wrongly set SESSION_SAVE_EVERY_REQUEST = True
in settings.py
without understanding the exact meaning.
Then every request to my django service would generate a session entry, especially the heartbeat test request from upstream load balancers. After several days' running, django_session
table turned to a huge one.
Django offers a management command to cleanup these expired sessions!
精彩评论