For a number of reasons^, I'd like to use a UUID as a primary key in some of my Django models. If I do so, will I still be able to use outside apps like "contrib.comments", "django-voting" or "django-tagging" which use generic relations via ContentType?
Using "django-voting" as an example, the Vote model looks like this:
class Vote(models.Model):
    user = models.ForeignKey(User)
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    object = generic.GenericForeignKey('content_type', 'object_id')
    vote = models.SmallIntegerField(choices=SCORES)
This app seems to be assuming that the primary key for the model being voted on is an integer.
The built-in comments app seems to be capable of handling non-integer PKs, though:
class BaseCommentAbstractModel(models.Model):
    content_type = models.ForeignKey(ContentType,
            verbose_name=_('content type'),
            related_name="content_type_set_for_%(class)s")
    object_pk = models.TextField(_('object ID'))
    content_object = generic.GenericForeignKey(ct_field="content_type", fk_field="object_pk")
Is this "integer-PK-assumed" problem a common situation for third-party apps which would make using UUIDs a pain? Or, possibly, am I misreading this situation?
Is there a way to use UUIDs as primary keys in Django without causing too much trouble?
^ Some of the reasons: hiding object counts, preventing URL "id crawling", using multiple servers to create non-conflicting objects, ...
As seen in the documentation, Django has had a built-in UUID field since version 1.8. The performance differences when using a UUID vs. an integer are negligible.
import uuid
from django.db import models

class MyUUIDModel(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
You can also check this answer for more information.
A UUID primary key will cause problems not only with generic relations, but with efficiency in general: every foreign key will be significantly more expensive—both to store, and to join on—than a machine word.
However, nothing requires the UUID to be the primary key: just make it a secondary key, by supplementing your model with a uuid field with unique=True. Use the implicit primary key as normal (internal to your system), and use the UUID as your external identifier.
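To make the table shape concrete, here is a minimal sketch outside Django, using only the standard-library sqlite3 module. The `site` table and its column names are hypothetical, chosen for illustration: an auto-incremented integer stays the internal primary key, and a unique UUID column serves as the external identifier.

```python
import sqlite3
import uuid

# In-memory database; the "site" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE site ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # internal key, used for joins
    " uuid TEXT NOT NULL UNIQUE,"             # external identifier
    " name TEXT NOT NULL)"
)

# Generate the external identifier in application code.
external_id = str(uuid.uuid4())
conn.execute(
    "INSERT INTO site (uuid, name) VALUES (?, ?)",
    (external_id, "example"),
)

# Callers look rows up by UUID; the integer id never leaves the system.
row = conn.execute(
    "SELECT id, name FROM site WHERE uuid = ?",
    (external_id,),
).fetchone()
```

In a Django model the equivalent would presumably be a `UUIDField(default=uuid.uuid4, unique=True, editable=False)` alongside the implicit `id`.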
The real problem with UUID as a PK is the disk fragmentation and insert degradation associated with non-numeric identifiers. Because the PK is a clustered index in many RDBMSs (for example MySQL's InnoDB, and SQL Server by default, though not PostgreSQL), when it's not auto-incremented, your DB engine will have to re-sort the stored rows when inserting a row with an id of lower ordinality, which will happen all the time with UUIDs. When you get lots of data in your DB, it may take many seconds or even minutes just to insert one new record. And your disk will eventually become fragmented, requiring periodic disk defragmentation. This is all really bad.
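The "lower ordinality" point can be seen in a small standard-library sketch. It is a toy model, not a benchmark of any database: a sorted Python list stands in for a clustered index, and the helper `count_mid_inserts` is a hypothetical name. It counts how many keys land somewhere before the current end of the index, which is the case that forces a clustered index to shift rows or split pages.

```python
import bisect
import random
import uuid

def count_mid_inserts(keys):
    """Count keys that sort before the current last entry of a sorted index."""
    index = []
    mid_inserts = 0
    for key in keys:
        pos = bisect.bisect_left(index, key)
        if pos < len(index):
            mid_inserts += 1  # not an append: existing entries must move
        index.insert(pos, key)
    return mid_inserts

rng = random.Random(0)  # seeded so the run is repeatable

# Auto-increment style keys always arrive in ascending order.
sequential_keys = list(range(1, 1001))

# Random version-4 UUIDs arrive in no particular order.
random_uuid_keys = [
    uuid.UUID(int=rng.getrandbits(128), version=4) for _ in range(1000)
]

sequential_mid = count_mid_inserts(sequential_keys)
uuid_mid = count_mid_inserts(random_uuid_keys)
```

Sequential keys never trigger a mid-index insert, while nearly every random UUID does.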
To solve for these, I recently came up with the following architecture that I thought would be worth sharing.
The UUID Pseudo-Primary-Key
This method allows you to leverage the benefits of a UUID as a primary key (using a unique-indexed UUID), while maintaining an auto-incremented PK to address the fragmentation and insert performance degradation concerns of having a non-numeric PK.
How it works:
- Create an auto-incremented primary key called pkid on your DB models.
- Add a unique-indexed UUID id field to allow you to search by a UUID id, instead of a numeric primary key.
- Point the ForeignKey to the UUID (using to_field='id') to allow your foreign keys to properly represent the pseudo-PK instead of the numeric ID.
Essentially, you will do the following:
First, create an abstract Django base model:
import uuid
from django.db import models

class UUIDModel(models.Model):
    pkid = models.BigAutoField(primary_key=True, editable=False)
    id = models.UUIDField(default=uuid.uuid4, editable=False, unique=True)

    class Meta:
        abstract = True
Make sure to extend the base model instead of models.Model:
class Site(UUIDModel):
    name = models.CharField(max_length=255)
Also make sure your ForeignKeys point to the UUID id field instead of the auto-incremented pkid field:
class Page(UUIDModel):
    site = models.ForeignKey(Site, to_field='id', on_delete=models.CASCADE)
If you're using Django Rest Framework (DRF), make sure to also create a Base ViewSet class to set the default search field:
class UUIDModelViewSet(viewsets.ModelViewSet):
    lookup_field = 'id'
And extend that instead of the base ModelViewSet for your API views:
class SiteViewSet(UUIDModelViewSet):
    model = Site

class PageViewSet(UUIDModelViewSet):
    model = Page
More notes on the why and the how in this article: https://www.stevenmoseley.com/blog/uuid-primary-keys-django-rest-framework-2-steps
I ran into a similar situation and found in the official Django documentation that the object_id doesn't have to be of the same type as the primary key of the related model. For example, if you want your generic relationship to be valid for both IntegerField and CharField IDs, just set your object_id to be a CharField. Since integers can be coerced into strings, it'll be fine. The same goes for UUIDField.
Example:
class Vote(models.Model):
    user = models.ForeignKey(User)
    content_type = models.ForeignKey(ContentType)
    object_id = models.CharField(max_length=50)  # <<-- This line was modified
    object = generic.GenericForeignKey('content_type', 'object_id')
    vote = models.SmallIntegerField(choices=SCORES)
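The coercion argument can be checked with plain Python. This sketch does not reproduce Django's internals; `to_object_id` is a hypothetical helper that only shows both kinds of primary key rendering to short strings that fit in a `CharField(max_length=50)` and convert back without loss.

```python
import uuid

def to_object_id(pk):
    """Render a primary key as the string a CharField object_id would hold."""
    return str(pk)

int_pk = 42
uuid_pk = uuid.uuid4()

int_object_id = to_object_id(int_pk)
uuid_object_id = to_object_id(uuid_pk)

# Both string forms convert back to the original key.
recovered_int = int(int_object_id)
recovered_uuid = uuid.UUID(uuid_object_id)
```

The canonical string form of a UUID is 36 characters, so it fits within the 50-character limit used above.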
This can be done with a custom abstract base model, using the following steps.
First, create a folder in your project called basemodel, then add a file abstractmodelbase.py containing the following:
from django.db import models
import uuid


class BaseAbstractModel(models.Model):
    """
    This model defines base models that implements common fields like:
    created_at
    updated_at
    is_deleted
    """
    id = models.UUIDField(primary_key=True, unique=True, default=uuid.uuid4, editable=False)
    created_at = models.DateTimeField(auto_now_add=True, editable=False)
    updated_at = models.DateTimeField(auto_now=True, editable=False)
    is_deleted = models.BooleanField(default=False)

    def soft_delete(self):
        """soft delete a model instance"""
        self.is_deleted = True
        self.save()

    class Meta:
        abstract = True
        ordering = ['-created_at']
Second, in the models file of each app, do this:
from django.db import models
# The base class lives in basemodel/abstractmodelbase.py (created above).
from basemodel.abstractmodelbase import BaseAbstractModel

# Create your models here.

class Incident(BaseAbstractModel):
    """ Incident model """
    place = models.CharField(max_length=50, blank=False, null=False)
    personal_number = models.CharField(max_length=12, blank=False, null=False)
    description = models.TextField(max_length=500, blank=False, null=False)
    action = models.TextField(max_length=500, blank=True, null=True)
    image = models.ImageField(upload_to='images/', blank=True, null=True)
    incident_date = models.DateTimeField(blank=False, null=False)
So the Incident model above inherits all the fields of BaseAbstractModel, including the UUID id.
The question can be rephrased as "is there a way to get Django to use a UUID for all database ids in all tables instead of an auto-incremented integer?".
Sure, I can do:
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
in all of my tables, but I can't find a way to do this for:
- 3rd party modules
- Django generated ManyToMany tables
So, this appears to be a missing Django feature.