How do I sort using two fields?

Developer https://www.devze.com 2023-02-06 09:27 Source: Internet
I have a sorting/grouping issue that I'm hoping somebody could add some insight on.

We have a table of stories that have a publish date and an updated date. I'm using Django so it looks like this:

class Story(models.Model):
    pub_date = models.DateTimeField(db_index=True)
    update_date = models.DateTimeField(blank=True, null=True, db_index=True)
    headline = models.CharField(max_length=200)
    ...

We want to display the stories on a paginated page grouped by day. So...

Jan 20
    Story 1
    Story 2

Jan 19
    Story 1
    Story 3

The challenge is that if a story has an update_date it should be displayed twice: once on the pub_date day and once on the update_date day (e.g. Story 1).

There are tens of thousands of stories, so I can't do it all in Python, but I don't know of a way to do this query in SQL.

What I have right now sorts everything by -pub_date and then takes the max and min dates on a given page. I then query for any stories with an update_date between those dates, and combine and group them in Python. The problem is that the number of items per page then becomes irregular.

So I guess my question is this: What is the best way to query a table for a list of items and sort them based on two fields, duplicating an item in the query if it has a value in the second field, and then sorting based on the two fields?

Hope that makes sense...


I can only think of UNION being able to do this.

Here's an example of what that would look like. I'm not sure how fast it is, or how well the database copes with this type of query being sent often, though.

The query assumes your table name is stories and uses the columns headline, pub_date and update_date. It also assumes that a story that hasn't been updated has NULL in the update_date column.

SELECT      headline,
            the_date,
            DATE(the_date) AS the_day  -- DATE() keeps year and month; DAY() returns only the day of the month
FROM (
    SELECT      headline,
                pub_date AS the_date
    FROM        stories
    UNION
    SELECT      headline,
                update_date AS the_date
    FROM        stories
    WHERE       update_date IS NOT NULL
) AS publishedandupdated
ORDER BY    the_date DESC;

If you want to add a LIMIT to the query, it should go last, after the ORDER BY clause.
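
To show where the LIMIT ends up, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory table, the sample rows, and the helper names fetch_page and make_demo_db are all made up for illustration. It uses UNION ALL so that no rows get dropped as duplicates, and SQLite's DATE() to get the calendar day. On another database the date function and the paging syntax may differ.

```python
import sqlite3

# Hypothetical table/column names, following the assumptions stated above.
UNION_PAGE_SQL = """
SELECT headline, the_date, DATE(the_date) AS the_day
FROM (
    SELECT headline, pub_date AS the_date
    FROM stories
    UNION ALL
    SELECT headline, update_date AS the_date
    FROM stories
    WHERE update_date IS NOT NULL
) AS publishedandupdated
ORDER BY the_date DESC
LIMIT ? OFFSET ?
"""


def fetch_page(conn, page, per_page=3):
    """Return one page of (headline, the_date, the_day) rows, newest first."""
    offset = (page - 1) * per_page
    return conn.execute(UNION_PAGE_SQL, (per_page, offset)).fetchall()


def make_demo_db():
    """Build a throwaway in-memory table with made-up sample stories."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stories (headline TEXT, pub_date TEXT, update_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO stories VALUES (?, ?, ?)",
        [
            ("Story 1", "2011-01-19 09:00:00", "2011-01-20 08:00:00"),
            ("Story 2", "2011-01-20 10:00:00", None),
            ("Story 3", "2011-01-19 07:00:00", None),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    for row in fetch_page(conn, page=1):
        print(row)
```

Because the LIMIT is applied to the combined result, every page holds the same number of rows, with an updated story counting once per date.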


Your question is similar to a problem I had. I read items from Facebook walls and had two dates: one for item creation (the user posts the item) and one for item retrieval (I read the item from Facebook). I wanted to show items that were posted or retrieved today.

SELECT link,time FROM homeWallItems WHERE 
DATE_SUB(CURDATE(),INTERVAL 1 DAY)<= created 
OR
DATE_SUB(CURDATE(),INTERVAL 1 DAY)<= time
group by time LIMIT 0,30

Edit: I was overoptimistic in the following sentence; it is wrong.

In this code, if you use time instead of CURDATE(), then it should work for you.


Making some assumptions on the column names, you need UNION ALL to retain duplicates from both parts.

    select headline, actualdate=pub_date
    from story
    where pub_date between /mindate/ and /maxdate/
union all
    select headline, actualdate=update_date
    from story
    where update_date between /mindate/ and /maxdate/
order by actualdate
  • The virtual field actualdate is used to match up the pub_date / update_date as a single column on which to ORDER BY.
  • The ORDER BY in a union-ed statement is applied AFTER the union has been done, so it only needs to appear once.
  • The filter on the date range is applied within each part of the union to reduce the worktable size (it shouldn't have to pull in all the data before applying the filter).
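
The three points above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the in-memory story table, the sample rows, and the helper names stories_by_day and make_demo_db are hypothetical. The /mindate/ and /maxdate/ placeholders become bound parameters, and the alias is written with AS since the actualdate=... form is not portable. The DESC ordering is my own assumption, to get the newest day first.

```python
import sqlite3
from itertools import groupby

# Each branch filters on its own date column; ORDER BY appears once, at the end.
RANGE_UNION_SQL = """
SELECT headline, pub_date AS actualdate
FROM story
WHERE pub_date BETWEEN :mindate AND :maxdate
UNION ALL
SELECT headline, update_date AS actualdate
FROM story
WHERE update_date BETWEEN :mindate AND :maxdate
ORDER BY actualdate DESC
"""


def stories_by_day(conn, mindate, maxdate):
    """Return [(day, [headline, ...]), ...] for the given range, newest day first."""
    params = {"mindate": mindate, "maxdate": maxdate}
    rows = conn.execute(RANGE_UNION_SQL, params).fetchall()
    # Rows arrive sorted by actualdate, so rows of the same day are adjacent
    # and groupby can bucket them on the 'YYYY-MM-DD' prefix.
    return [
        (day, [headline for headline, _ in group])
        for day, group in groupby(rows, key=lambda row: row[1][:10])
    ]


def make_demo_db():
    """Build a throwaway in-memory table with made-up sample stories."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE story (headline TEXT, pub_date TEXT, update_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO story VALUES (?, ?, ?)",
        [
            ("Story 1", "2011-01-19 09:00:00", "2011-01-20 08:00:00"),
            ("Story 2", "2011-01-20 10:00:00", None),
            ("Story 3", "2011-01-19 07:00:00", None),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    for day, headlines in stories_by_day(
        conn, "2011-01-19 00:00:00", "2011-01-20 23:59:59"
    ):
        print(day, headlines)
```

Rows with a NULL update_date fail the BETWEEN test in the second branch, so they show up only once, under their pub_date.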