Word count query in Django

Posted on https://www.devze.com, 2023-01-25 20:53. Source: the web.
Given a model with both Boolean and TextField fields, I want to do a query that finds records that match some criteria AND have more than "n" words in the TextField. Is this possible? E.g.:

from django.db import models

class Item(models.Model):
    ...
    notes = models.TextField(blank=True)
    has_media = models.BooleanField(default=False)
    completed = models.BooleanField(default=False)
    ...

This is easy:

items = Item.objects.filter(completed=True, has_media=True)

but how can I filter for a subset of those records where the "notes" field has more than, say, 25 words?


Try this:

Item.objects.extra(where=["LENGTH(notes) - LENGTH(REPLACE(notes, ' ', ''))+1 > %s"], params=[25])

This code uses Django's extra() queryset method to add a custom WHERE clause. The calculation in the WHERE clause counts the occurrences of the space character, assuming that words are separated by exactly one space. Adding one to the result accounts for the first word. Since extra() returns a queryset, it can be chained after the filter() call from the question.
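To see what the expression in the WHERE clause computes, here is a minimal sketch that runs it against an in-memory SQLite table using Python's sqlite3 module. The item table and the sample strings are made up for illustration, and SQLite merely stands in for whatever database the project actually uses.

```python
import sqlite3

# Same expression as in the WHERE clause, selected as a column so the
# estimate can be inspected for each row.
WORD_ESTIMATE_SQL = "LENGTH(notes) - LENGTH(REPLACE(notes, ' ', '')) + 1"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (notes TEXT)")  # hypothetical table

samples = [
    "one two three",     # single spaces: estimate 3, real count 3
    "one  two   three",  # repeated spaces: estimate 6, real count 3
    "one\ntwo\tthree",   # newline and tab are not spaces: estimate 1, real count 3
    "",                  # empty text: estimate 1, real count 0
]
conn.executemany("INSERT INTO item (notes) VALUES (?)", [(s,) for s in samples])

rows = conn.execute(
    f"SELECT notes, {WORD_ESTIMATE_SQL} FROM item ORDER BY rowid"
).fetchall()

# Pair each SQL estimate with Python's whitespace-based word count.
comparison = [(notes, estimate, len(notes.split())) for notes, estimate in rows]
for notes, estimate, exact in comparison:
    print(repr(notes), "estimate:", estimate, "exact:", exact)
```

The estimate matches the real count only when words are separated by single spaces. Repeated spaces, tabs, newlines and empty text all throw it off.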

Of course, this calculation is only an approximation to the real word count, so if it has to be precise, I'd do the word count in Python.
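For the precise count, one option is to let the database apply the Boolean filters and count the words in Python afterwards. A minimal sketch, where more_words_than is a hypothetical helper and SimpleNamespace objects stand in for Item instances so that the snippet runs without a Django project:

```python
from types import SimpleNamespace


def more_words_than(items, n):
    """Return the items whose notes hold more than n whitespace-separated words."""
    return [item for item in items if len(item.notes.split()) > n]


# Stand-ins for model instances; any iterable of objects with a
# .notes attribute is handled the same way.
candidates = [
    SimpleNamespace(notes="short note"),
    SimpleNamespace(notes="this   note has\nfive words"),
]

long_notes = more_words_than(candidates, 3)
print([item.notes for item in long_notes])
```

With the model from the question, the call would presumably look like more_words_than(Item.objects.filter(completed=True, has_media=True), 25). This loads every matching row into memory, which may be slow on large tables.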


I don't know what SQL needs to be run for the DB to do the work, which is really what we want, but you can work around it.

Add an extra field named wordcount or something similar, then override the save method so that it counts the words in notes before saving the model.

Then it is trivial to query on, and there is little chance that this denormalized data will go stale, since the save method runs every time a model instance is saved. (Bulk operations such as QuerySet.update() bypass save(), so they would need to update the count themselves.)

There might be a better way, but if all else fails, this is what I would do.
