Consider the models:
class Author(models.Model):
    name = models.CharField(max_length=200, unique=True)

class Book(models.Model):
    pub_date = models.DateTimeField()
    author = models.ForeignKey(Author)
Now suppose I want to order all the books by, say, their pub_date. I would use order_by('pub_date'). But what if I want a list of all authors ordered according to who most recently published books?
It's really very simple when you think about it. It's essentially:
- The author on top is the one who most recently published a book
- The next one is the author whose latest book is the next most recent
- And so on.
I could probably hack something together, but since this could grow big, I need to know that I'm doing it right.
Help appreciated!
Edit: Lastly, would the option of just adding a new field to each one to show the date of the last book and just updating that the whole time be better?
from django.db.models import Max
Author.objects.annotate(max_pub_date=Max('books__pub_date')).order_by('-max_pub_date')
This requires Django 1.1 or later.
I assumed you will add related_name='books' to the author field in the Book model, so the reverse relation is accessed as Author.books instead of Author.book_set. It's much more readable.
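For intuition, here is roughly what that annotation amounts to, sketched in plain Python rather than Django. The list of (author, pub_date) pairs and the function name authors_by_latest_book are made up for illustration; the real query does this work inside the database.

```python
from datetime import date


def authors_by_latest_book(books):
    """Return author names ordered by their most recent pub_date, newest first."""
    latest = {}
    for author, pub_date in books:
        # Keep only the newest date seen for each author (the Max() step).
        if author not in latest or pub_date > latest[author]:
            latest[author] = pub_date
    # Sort authors by that date, descending (the order_by('-max_pub_date') step).
    return sorted(latest, key=latest.get, reverse=True)


# Hypothetical sample data, not taken from the question.
books = [
    ("alice", date(2009, 1, 10)),
    ("bob", date(2009, 3, 5)),
    ("alice", date(2009, 6, 1)),
    ("carol", date(2008, 12, 24)),
]
ordered = authors_by_latest_book(books)
```

One difference to keep in mind: authors without any books never show up in this sketch, whereas the annotated queryset still includes them, with a null max_pub_date.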
Or, you could play around with something like this:
Author.objects.filter(book__pub_date__isnull=False).order_by('-book__pub_date')
Lastly, would the option of just adding a new field to each one to show the date of the last book and just updating that the whole time be better?
Actually it would! This is a normal denormalization practice and can be done like this:
class Author(models.Model):
    name = models.CharField(max_length=200, unique=True)
    latest_pub_date = models.DateTimeField(null=True, blank=True)

    def update_pub_date(self):
        try:
            # Store the pub_date of the newest book, not the Book object itself
            self.latest_pub_date = self.book_set.order_by('-pub_date')[0].pub_date
            self.save()
        except IndexError:
            pass # no books yet!

class Book(models.Model):
    pub_date = models.DateTimeField()
    author = models.ForeignKey(Author)

    def save(self, **kwargs):
        super(Book, self).save(**kwargs)
        self.author.update_pub_date()

    def delete(self):
        super(Book, self).delete()
        self.author.update_pub_date()
This is the third common option you have besides the two already suggested:
- doing it in SQL with a join and grouping
- fetching all the books on the Python side and removing duplicates
Both of these options compute the pub_dates from normalized data at the time you read them. Denormalization does this computation for each author at the time you write new data. The idea is that most web apps do reads more often than writes, so this approach is preferable.
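To make the first option concrete, here is a rough sketch of the join-and-group query, run against an in-memory SQLite database with Python's sqlite3 module. The table names author and book and the sample rows are simplified assumptions; Django would normally prefix the table names with the app label.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        pub_date TEXT,
        author_id INTEGER REFERENCES author(id)
    );
    INSERT INTO author (id, name) VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO book (pub_date, author_id) VALUES
        ('2009-01-10', 1), ('2009-03-05', 2), ('2009-06-01', 1);
""")

# Join authors to their books, collapse to one row per author, and sort by
# the newest pub_date in each group. Carol has no books, so her MAX is NULL.
rows = conn.execute("""
    SELECT author.name, MAX(book.pub_date) AS max_pub_date
    FROM author
    LEFT OUTER JOIN book ON book.author_id = author.id
    GROUP BY author.id, author.name
    ORDER BY max_pub_date DESC
""").fetchall()
```

The grouping is redone on every read, which is exactly the cost that the denormalized latest_pub_date field moves to write time.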
One of the perceived downsides of this is that you basically have the same data in different places and have to keep it in sync. It usually horrifies database people to death :-). But this is usually not a problem as long as you use your ORM models to work with the data (which you probably do anyway). In Django it's the app that controls the database, not the other way around.
Another (more realistic) downside is that with the naive code I've shown, a massive update of books may be way slower, since each book pings its author to update its data on every save, no matter what. This is usually solved by having a flag to temporarily disable calling update_pub_date and calling it manually afterwards. Basically, denormalized data requires more maintenance than normalized data.
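The flag idea could look something like the sketch below. It uses a plain Python stand-in class instead of real Django models so that it runs on its own; the names auto_update, update_calls, add_book and deferred_updates are all hypothetical and only illustrate the pattern.

```python
from contextlib import contextmanager


class Author:
    """Plain-Python stand-in for the Author model (not a Django model)."""

    def __init__(self, name):
        self.name = name
        self.books = []            # pub_dates as ISO strings, for simplicity
        self.latest_pub_date = None
        self.auto_update = True    # the flag
        self.update_calls = 0      # counts recomputations, for illustration only

    def update_pub_date(self):
        self.update_calls += 1
        self.latest_pub_date = max(self.books) if self.books else None

    def add_book(self, pub_date):
        self.books.append(pub_date)
        if self.auto_update:
            self.update_pub_date()


@contextmanager
def deferred_updates(author):
    """Switch off per-book updates, then recompute once at the end."""
    author.auto_update = False
    try:
        yield author
    finally:
        author.auto_update = True
        author.update_pub_date()


author = Author("alice")
with deferred_updates(author):
    for pub_date in ["2009-01-10", "2009-06-01", "2009-03-05"]:
        author.add_book(pub_date)
```

Three books are added but latest_pub_date is recomputed only once, when the block exits. In a real app the same check would sit in Book.save() before the call to self.author.update_pub_date().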
def remove_duplicates(seq):
    """Drop repeated items while keeping the first occurrence of each."""
    seen = {}
    result = []
    for item in seq:
        if item in seen:
            continue
        seen[item] = 1
        result.append(item)
    return result

# Get the author ids of all books, most recent first
query_result = Book.objects.order_by('-pub_date').values_list('author', flat=True)
# Remove duplicate authors, keeping each one at its most recent position
recent_authors = remove_duplicates(query_result)
Building on ayaz's solution, what about: Author.objects.filter(book__pub_date__isnull=False).distinct().order_by('-book__pub_date')
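One thing to check before relying on distinct() here: Django's documentation notes that fields used in order_by() are added to the selected columns, so ordering by a field on a related model can make otherwise duplicate rows look distinct. The sqlite3 sketch below imitates the shape such a query would take, with hypothetical table names and sample rows; it is not SQL captured from Django.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, pub_date TEXT, author_id INTEGER);
    INSERT INTO author (id, name) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO book (pub_date, author_id) VALUES
        ('2009-01-10', 1), ('2009-03-05', 2), ('2009-06-01', 1);
""")

# DISTINCT applies to (name, pub_date) pairs because the ordering column is
# part of the SELECT list, so alice is returned once per book.
rows = conn.execute("""
    SELECT DISTINCT author.name, book.pub_date
    FROM author
    INNER JOIN book ON book.author_id = author.id
    WHERE book.pub_date IS NOT NULL
    ORDER BY book.pub_date DESC
""").fetchall()
names = [name for name, _ in rows]
```

If the real queryset behaves the same way, an author with several books still appears several times, and the annotate(Max(...)) version is the safer way to get one row per author.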