Need advice on a MySQL query: sorting, grouping, and performance

I have a "message" table where users send and receive messages, which is quite straightforward. What I would like to do is retrieve the DISTINCT sender_ids WHERE receiver_id is X, sorted so that senders whose messages recipient X has not yet read come first, senders whose messages recipient X has already read come after them, and everything is sorted by created_at DESC.

Any ideas how I can do this? Note: performance is also a concern.

This is the query I used, but the sorting does not seem to work correctly; maybe DISTINCT is messing things up? I expect a result of 6, 5, 4, 2, 3, but I get 6, 5, 4, 3, 2.

SELECT DISTINCT sender_id FROM message m WHERE receiver_id = 1 ORDER BY read_at, created_at DESC 

Here is a table with sample data:

 CREATE TABLE `message` (
   `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
   `sender_id` bigint(20) NOT NULL,
   `receiver_id` bigint(20) NOT NULL,
   `message` text,
   `read_at` datetime DEFAULT NULL,
   `created_at` datetime DEFAULT NULL,
   PRIMARY KEY (`id`),
   KEY `sender` (`sender_id`),
   KEY `receiver` (`receiver_id`),
   KEY `dates` (`receiver_id`,`read_at`,`created_at`)
 ) ENGINE=MyISAM AUTO_INCREMENT=13 DEFAULT CHARSET=latin1;

 INSERT INTO `message` (id, sender_id, receiver_id, message, read_at, created_at) VALUES
   (1,2,1,NULL,'2011-01-01 01:01:01','2011-01-01 01:01:01'),
   (2,1,2,NULL,'2011-01-01 01:01:01','2011-01-01 01:01:02'),
   (3,2,1,NULL,'2011-01-01 01:01:01','2011-01-01 01:01:03'),
   (4,3,1,NULL,'2011-01-01 01:01:01','2011-01-01 01:01:04'),
   (5,3,1,NULL,'2011-01-01 01:01:01','2011-01-01 01:01:05'),
   (6,1,4,NULL,'2011-01-01 01:01:01','2011-01-01 01:01:06'),
   (7,4,1,NULL,NULL,'2011-01-01 01:01:07'),
   (8,5,1,NULL,NULL,'2011-01-01 01:01:08'),
   (9,5,1,NULL,NULL,'2011-01-01 01:01:09'),
   (10,1,6,NULL,NULL,'2011-01-01 01:01:10'),
   (11,6,1,NULL,NULL,'2011-01-01 01:01:11');
4 answers

The following returns the desired result according to the sample:

 SELECT sender_id FROM message AS m WHERE receiver_id=? GROUP BY sender_id ORDER BY COUNT(*)=COUNT(read_at), MAX(created_at) DESC; 

If you want to use the oldest message when sorting by created_at, change MAX to MIN.

COUNT(read_at) ignores NULLs, while COUNT(*) does not, so the two will be unequal if there are unread messages. If there are not too many messages for the given receiver, it should run quite quickly (the index on receiver_id will help). Profile your query before making any optimization decisions.
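To see the two counts side by side, here is a minimal Python sketch. It assumes SQLite (through the standard sqlite3 module) as a stand-in for MySQL, so it only illustrates the grouping and ordering logic and says nothing about MySQL's performance. The function name senders_unread_first and the compact SAMPLE encoding are my own; the rows are the ones from the question.

```python
import sqlite3

# (id, sender_id, receiver_id, is_read) taken from the question's sample rows.
# In that sample, created_at is '2011-01-01 01:01:<id>' and every read row has
# read_at = '2011-01-01 01:01:01', so both columns are rebuilt from these tuples.
SAMPLE = [
    (1, 2, 1, True), (2, 1, 2, True), (3, 2, 1, True), (4, 3, 1, True),
    (5, 3, 1, True), (6, 1, 4, True), (7, 4, 1, False), (8, 5, 1, False),
    (9, 5, 1, False), (10, 1, 6, False), (11, 6, 1, False),
]


def senders_unread_first(receiver_id):
    """Return (sender_id, total, read_count) rows, senders with unread messages first."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
    conn.execute(
        "CREATE TABLE message (id INTEGER PRIMARY KEY, sender_id INTEGER, "
        "receiver_id INTEGER, read_at TEXT, created_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO message VALUES (?, ?, ?, ?, ?)",
        [
            (i, s, r, "2011-01-01 01:01:01" if is_read else None,
             "2011-01-01 01:01:%02d" % i)
            for i, s, r, is_read in SAMPLE
        ],
    )
    rows = conn.execute(
        """
        SELECT sender_id, COUNT(*) AS total, COUNT(read_at) AS read_count
        FROM message
        WHERE receiver_id = ?
        GROUP BY sender_id
        -- 0 (some unread) sorts before 1 (all read); newest message breaks ties
        ORDER BY COUNT(*) = COUNT(read_at), MAX(created_at) DESC
        """,
        (receiver_id,),
    ).fetchall()
    conn.close()
    return rows


for row in senders_unread_first(1):
    print(row)
```

In this sketch, senders 6, 5 and 4 have read_count 0 and come first, followed by senders 3 and 2, whose read_count equals total.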

With a few tweaks, you can make The Scrum Meister's aggregate expression work: try MIN(IF(read_at IS NULL, 0, 1)) in place of COUNT(*)=COUNT(read_at). I don't think it will improve the running time, but there is at least a small chance (as with most optimization, it depends on MySQL's internals).
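A quick way to convince yourself that the two expressions produce the same flag is to compute both per sender. This is only a sketch: it again assumes SQLite via Python's sqlite3 module rather than MySQL, writes the IF(...) as a portable CASE expression, and uses hypothetical rows (including a sender 7 with one read and one unread message) that are not in the question's sample. The function name all_read_flags is made up for the illustration.

```python
import sqlite3


def all_read_flags(rows):
    """rows: (sender_id, read_at) pairs, with read_at = None for unread messages.

    Returns (sender_id, by_count, by_min) per sender, where both flags are
    1 only when every message from that sender has been read.
    """
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
    conn.execute("CREATE TABLE message (sender_id INTEGER, read_at TEXT)")
    conn.executemany("INSERT INTO message VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT sender_id,
               COUNT(*) = COUNT(read_at) AS by_count,
               -- CASE form of MySQL's IF(read_at IS NULL, 0, 1)
               MIN(CASE WHEN read_at IS NULL THEN 0 ELSE 1 END) AS by_min
        FROM message
        GROUP BY sender_id
        ORDER BY sender_id
        """
    ).fetchall()
    conn.close()
    return result


# Hypothetical rows: sender 2 all read, sender 5 all unread, sender 7 mixed.
HYPOTHETICAL = [
    (2, "2011-01-01 01:01:01"),
    (2, "2011-01-01 01:01:01"),
    (5, None),
    (7, "2011-01-01 01:01:01"),
    (7, None),
]
print(all_read_flags(HYPOTHETICAL))
```

Both flags agree for every sender in this sketch, including the mixed one, so swapping one expression for the other should not change the ordering.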

EXPLAIN output for the sample table:

 +----+-------------+-------+------+----------------+----------+---------+-------+------+----------------------------------------------+
 | id | select_type | table | type | possible_keys  | key      | key_len | ref   | rows | Extra                                        |
 +----+-------------+-------+------+----------------+----------+---------+-------+------+----------------------------------------------+
 | 1  | SIMPLE      | m     | ref  | receiver,dates | receiver | 8       | const | 7    | Using where; Using temporary; Using filesort |
 +----+-------------+-------+------+----------------+----------+---------+-------+------+----------------------------------------------+

Getting rid of the aggregate functions applied to message rows:

 SELECT sender_id
 FROM (
     -- senders with at least one unread message
     (SELECT sender_id, 0 AS all_read, MAX(created_at) AS recent
      FROM message AS m
      WHERE receiver_id=:receiver AND read_at IS NULL
      GROUP BY sender_id)
     UNION
     -- senders with at least one read message
     (SELECT sender_id, 1 AS all_read, MAX(created_at) AS recent
      FROM message AS m
      WHERE receiver_id=:receiver AND read_at IS NOT NULL
      GROUP BY sender_id)
 ) AS t
 GROUP BY sender_id
 -- MAX(recent) avoids ordering by a non-aggregated column in a grouped query
 ORDER BY MIN(all_read), MAX(recent) DESC;

looks to lose ground. This query works by using constant values (which the separate queries allow) for the column indicating whether any messages from the sender are unread, rather than aggregate expressions. Here's the EXPLAIN output for this query:

 +------+--------------+------------+-------+----------------+-------+---------+------+------+----------------------------------------------+
 | id   | select_type  | table      | type  | possible_keys  | key   | key_len | ref  | rows | Extra                                        |
 +------+--------------+------------+-------+----------------+-------+---------+------+------+----------------------------------------------+
 | 1    | PRIMARY      | <derived2> | ALL   | NULL           | NULL  | NULL    | NULL | 5    | Using temporary; Using filesort              |
 | 2    | DERIVED      | m          | ref   | receiver,dates | dates | 17      |      | 4    | Using where; Using temporary; Using filesort |
 | 3    | UNION        | m          | range | receiver,dates | dates | 17      | NULL | 3    | Using where; Using temporary; Using filesort |
 | NULL | UNION RESULT | <union2,3> | ALL   | NULL           | NULL  | NULL    | NULL | NULL |                                              |
 +------+--------------+------------+-------+----------------+-------+---------+------+------+----------------------------------------------+
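For anyone who wants to try the constant-flag idea outside MySQL, here is a rough Python sketch using SQLite through the sqlite3 module. Several things in it are assumptions rather than part of the query above: SQLite does not accept parentheses around the UNION branches, so they are dropped; the outer ORDER BY uses MAX(recent); and the rows are hypothetical, with sender 3 having both a read and an unread message. The name senders_via_union is invented for the example.

```python
import sqlite3

# Hypothetical rows: (sender_id, receiver_id, read_at, created_at)
HYPOTHETICAL_ROWS = [
    (2, 1, "2011-01-01 01:01:01", "2011-01-01 01:01:03"),  # sender 2: read
    (3, 1, "2011-01-01 01:01:01", "2011-01-01 01:01:05"),  # sender 3: read
    (3, 1, None, "2011-01-01 01:01:06"),                   # sender 3: unread
    (4, 1, None, "2011-01-01 01:01:07"),                   # sender 4: unread
    (9, 2, None, "2011-01-01 01:01:08"),                   # other receiver
]


def senders_via_union(rows, receiver_id):
    """Order senders for one receiver: any unread first, then newest message."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
    conn.execute(
        "CREATE TABLE message (sender_id INTEGER, receiver_id INTEGER, "
        "read_at TEXT, created_at TEXT)"
    )
    conn.executemany("INSERT INTO message VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT sender_id
        FROM (
            SELECT sender_id, 0 AS all_read, MAX(created_at) AS recent
            FROM message
            WHERE receiver_id = :receiver AND read_at IS NULL
            GROUP BY sender_id
            UNION
            SELECT sender_id, 1 AS all_read, MAX(created_at) AS recent
            FROM message
            WHERE receiver_id = :receiver AND read_at IS NOT NULL
            GROUP BY sender_id
        ) AS t
        GROUP BY sender_id
        -- a sender present in both branches collapses to all_read = 0
        ORDER BY MIN(all_read), MAX(recent) DESC
        """,
        {"receiver": receiver_id},
    ).fetchall()
    conn.close()
    return [sender_id for (sender_id,) in result]


print(senders_via_union(HYPOTHETICAL_ROWS, 1))
```

With these rows the mixed sender 3 lands in the unread group because MIN(all_read) picks the 0 from the unread branch, which is the same grouping the COUNT-based query would give.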

How about GROUP BY:

 SELECT sender_id FROM message m WHERE receiver_id = 1 GROUP BY sender_id ORDER BY MAX(IFNULL(read_at,'9999-01-01')) DESC 

First, optimize the table a little. This is how I would do it:

 create table messages (
   message_id bigint unsigned not null auto_increment primary key,
   sender_id bigint unsigned not null,
   receiver_id bigint unsigned not null,
   read_at datetime default null,
   created_at datetime
 ) engine=innodb;

 create table message_body (
   message_id bigint unsigned not null,
   message varchar(32000) not null
 ) engine=innodb;

I use varchar instead of text because a text field always stores a 2-byte length prefix, even for a small message, whereas, as I understand it, a message of fewer than 255 characters in a varchar needs only 1 byte instead of 2. See here.

This way, rows are much lighter to load because the message bodies are not in the same table, and if you are going to fetch a lot of data, that will be very useful!

The query you asked for would look like this:

 select distinct(sender_id) from messages where receiver_id = x group by sender_id order by read_at desc 

I really do not understand the part "everything is sorted by created_at DESC".

If unread messages should be displayed first, you cannot sort "everything" by created_at.

But if you want to list all unread messages first (sorted by created_at) and then all read messages (again sorted by created_at), the following will do it:

  SELECT *
 FROM message m
 WHERE receiver_id = 1
 ORDER BY 
     CASE 
       WHEN read_at IS NULL THEN 0
       ELSE 1
     END ASC,
     created_at DESC;

This produces a slightly different order than you expect, but looking at the sample data, I think it should be correct.
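To make that ordering concrete, here is a small Python sketch that runs the same CASE-based ORDER BY against the question's sample rows. It assumes SQLite (via the standard sqlite3 module) instead of MySQL, and the function name messages_unread_first and the compact SAMPLE encoding are made up for the illustration. Note that it returns one row per message, not one row per sender.

```python
import sqlite3

# (id, sender_id, receiver_id, is_read) taken from the question's sample rows.
# In that sample, created_at is '2011-01-01 01:01:<id>' and every read row has
# read_at = '2011-01-01 01:01:01'.
SAMPLE = [
    (1, 2, 1, True), (2, 1, 2, True), (3, 2, 1, True), (4, 3, 1, True),
    (5, 3, 1, True), (6, 1, 4, True), (7, 4, 1, False), (8, 5, 1, False),
    (9, 5, 1, False), (10, 1, 6, False), (11, 6, 1, False),
]


def messages_unread_first(receiver_id):
    """Return (id, sender_id) per message: unread first, newest first within each group."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
    conn.execute(
        "CREATE TABLE message (id INTEGER PRIMARY KEY, sender_id INTEGER, "
        "receiver_id INTEGER, read_at TEXT, created_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO message VALUES (?, ?, ?, ?, ?)",
        [
            (i, s, r, "2011-01-01 01:01:01" if is_read else None,
             "2011-01-01 01:01:%02d" % i)
            for i, s, r, is_read in SAMPLE
        ],
    )
    rows = conn.execute(
        """
        SELECT id, sender_id
        FROM message
        WHERE receiver_id = ?
        ORDER BY CASE WHEN read_at IS NULL THEN 0 ELSE 1 END ASC,
                 created_at DESC
        """,
        (receiver_id,),
    ).fetchall()
    conn.close()
    return rows


print(messages_unread_first(1))
```

In this sketch the unread messages 11, 9, 8, 7 come first, followed by the read messages 5, 4, 3, 1, so senders 5, 3 and 2 each appear twice.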


Source: https://habr.com/ru/post/1334174/

