How to sort using two fields?

I have a sorting/grouping problem, and I hope someone can offer some input.

We have a table of stories, each with a publication date and an updated date. The Django model looks like this:

    class Story(models.Model):
        pub_date = models.DateTimeField(db_index=True)
        update_date = models.DateTimeField(blank=True, null=True, db_index=True)
        headline = models.CharField(max_length=200)
        ...

We want to display stories page by page, grouped by day, like this:

    Jan 20
        Story 1
        Story 2
    Jan 19
        Story 1
        Story 3

The problem is that if a story has an update_date, it should be displayed twice: once under the day of its pub_date and once under the day of its update_date (Story 1 in the example above).

There are tens of thousands of stories, so of course I can't do all of this in Python, but I don't know how to do it in SQL.

Right now I sort everything by -pub_date and take the maximum and minimum dates on the resulting page. Then I query for any stories whose update_date falls between those dates, and merge and group the two result sets in Python. The problem is that the number of items per page ends up irregular.

So I think my question is this: what is the best way to query a table for a list of items, duplicate an item in the result if it has a value in the second date field, and then sort the result on both fields?

Hope this makes sense ...

3 answers

The only thing I can think of that would do this is a "union".

Here is an example of how it would look. I'm not sure how fast it is, or how well the database copes with running this type of query frequently, though.

The query assumes your table is named stories and has the columns pub_date and update_date. It also assumes that a story that has not been updated has NULL in the update_date column.

    SELECT headline, the_date, DAY(the_date) AS the_day
    FROM (
        SELECT headline, pub_date AS the_date
        FROM stories
        UNION
        SELECT headline, update_date AS the_date
        FROM stories
        WHERE update_date IS NOT NULL
    ) AS publishedandupdated
    ORDER BY the_date DESC;
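To see what a query of this shape returns, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory table. The sample rows are invented for illustration, and because SQLite has no DAY() function the sketch uses SQLite's date() instead, which returns the full calendar date rather than the day of the month. The grouping by day is then done in Python on the already sorted rows.

```python
import sqlite3
from itertools import groupby

# In-memory database with a hypothetical stories table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stories (headline TEXT, pub_date TEXT, update_date TEXT)"
)
conn.executemany(
    "INSERT INTO stories VALUES (?, ?, ?)",
    [
        ("Story 1", "2011-01-19 09:00:00", "2011-01-20 12:00:00"),
        ("Story 2", "2011-01-20 08:00:00", None),
        ("Story 3", "2011-01-19 07:00:00", None),
    ],
)

# Same shape as the query above; date() replaces DAY() for SQLite.
rows = conn.execute(
    """
    SELECT headline, the_date, date(the_date) AS the_day
    FROM (
        SELECT headline, pub_date AS the_date FROM stories
        UNION
        SELECT headline, update_date AS the_date FROM stories
        WHERE update_date IS NOT NULL
    ) AS publishedandupdated
    ORDER BY the_date DESC
    """
).fetchall()

# Rows arrive sorted by date, so consecutive rows sharing a day form one group.
grouped = [
    (day, [headline for headline, _, _ in items])
    for day, items in groupby(rows, key=lambda row: row[2])
]

for day, headlines in grouped:
    print(day, headlines)
```

With this sample data, Story 1 shows up under both days, which is the duplication the question asks for.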

If you want to add a LIMIT to the query, it goes last, after the ORDER BY clause.
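For paging, a row limit after the ORDER BY could look roughly like the sketch below. It again uses sqlite3 with invented sample data; fetch_page is a hypothetical helper, not part of any library, and LIMIT ? OFFSET ? is the SQLite spelling (MySQL also accepts LIMIT offset, count). Since the limit applies to the combined result, every page except the last holds the same number of entries.

```python
import sqlite3

# The limit comes last, after ORDER BY, so it applies to the whole union.
PAGE_QUERY = """
    SELECT headline, the_date
    FROM (
        SELECT headline, pub_date AS the_date FROM stories
        UNION
        SELECT headline, update_date AS the_date FROM stories
        WHERE update_date IS NOT NULL
    ) AS publishedandupdated
    ORDER BY the_date DESC
    LIMIT ? OFFSET ?
"""


def fetch_page(conn, page, per_page):
    """Return one page of (headline, the_date) rows; page numbers start at 1."""
    offset = (page - 1) * per_page
    return conn.execute(PAGE_QUERY, (per_page, offset)).fetchall()


# Hypothetical sample data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stories (headline TEXT, pub_date TEXT, update_date TEXT)"
)
conn.executemany(
    "INSERT INTO stories VALUES (?, ?, ?)",
    [
        ("Story 1", "2011-01-19 09:00:00", "2011-01-20 12:00:00"),
        ("Story 2", "2011-01-20 08:00:00", None),
        ("Story 3", "2011-01-19 07:00:00", None),
    ],
)

first_page = fetch_page(conn, page=1, per_page=3)
second_page = fetch_page(conn, page=2, per_page=3)
print(first_page)
print(second_page)
```

A day can be split across two pages this way, so the page template may need to repeat the day heading at the top of the next page.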


Your question is similar to a problem I had. I was reading items from Facebook walls and had two dates: one for when the item was created (the user posts the item) and one for when it was retrieved (I read the item from Facebook). I wanted to show items posted or retrieved today.

    SELECT link, time
    FROM homeWallItems
    WHERE DATE_SUB(CURDATE(), INTERVAL 1 DAY) <= created
       OR DATE_SUB(CURDATE(), INTERVAL 1 DAY) <= time
    GROUP BY time
    LIMIT 0, 30

Edit: I was too optimistic about this suggestion; it is wrong.

In this code, if you use time instead of CURDATE(), it should work for you.


Making some assumptions about column names: you need UNION ALL to keep duplicates from both parts.

    select headline, actualdate=pub_date
    from story
    where pub_date between /mindate/ and /maxdate/
    union all
    select headline, actualdate=update_date
    from story
    where update_date between /mindate/ and /maxdate/
    order by actualdate
  • The virtual column actualdate presents pub_date / update_date as a single column, which is what the ORDER BY sorts on.
  • ORDER BY in a union-ed statement applies after the union is complete, so it should appear only once.
  • The date-range filter is applied inside each part of the union to reduce the size of the working set (the database does not have to pull in all the data before applying the filter).
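The difference between UNION ALL and UNION can be checked with a small sqlite3 sketch. The sample rows and the date range are invented, the alias is written as `update_date AS actualdate` because SQLite does not accept the `actualdate=...` form, and named parameters stand in for /mindate/ and /maxdate/. Story 2 is deliberately given the same pub_date and update_date to show where the two operators diverge.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE story (headline TEXT, pub_date TEXT, update_date TEXT)"
)
conn.executemany(
    "INSERT INTO story VALUES (?, ?, ?)",
    [
        ("Story 1", "2011-01-19", "2011-01-20"),
        ("Story 2", "2011-01-20", "2011-01-20"),  # updated on its publication day
        ("Story 3", "2011-01-10", None),  # published outside the range
    ],
)

# Each branch filters on its own date column; the single ORDER BY sorts the
# combined result. {operator} is filled with a fixed keyword, never user input.
TEMPLATE = """
    SELECT headline, pub_date AS actualdate FROM story
    WHERE pub_date BETWEEN :mindate AND :maxdate
    {operator}
    SELECT headline, update_date AS actualdate FROM story
    WHERE update_date BETWEEN :mindate AND :maxdate
    ORDER BY actualdate, headline
"""
params = {"mindate": "2011-01-19", "maxdate": "2011-01-20"}

with_all = conn.execute(TEMPLATE.format(operator="UNION ALL"), params).fetchall()
distinct = conn.execute(TEMPLATE.format(operator="UNION"), params).fetchall()

print(len(with_all), with_all)
print(len(distinct), distinct)
```

With this data UNION ALL keeps both rows for Story 2 while plain UNION collapses them into one. Whether a story updated on its publication day should be listed twice under that day is a display decision the question does not settle.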

Source: https://habr.com/ru/post/1336393/

