Slow MySQL query with COUNT subqueries

I have no idea why, but this query takes more than 6 seconds to execute. As far as I can tell the indexes are set up correctly, and if I run each query separately, each one completes in under 0.5 seconds.

Here is the query:

 SELECT c.supplier_id, supplier_name, address1, address2, address3, address4,
        suppliertype, postcode, contact_name,
        (SELECT COUNT(*)
           FROM supplier_questions q1
          WHERE c.supplier_id = q1.supplier_id
            AND q1.incomplete = '0') AS questions,
        IF(active=1, 'Yes', IF(active=2, 'NCR Only', 'Inactive')) AS rated,
        (SELECT COUNT(*)
           FROM supplier_questions q2
          WHERE c.supplier_id = q2.supplier_id
            AND q2.reviewed = '1') AS reviewed,
        questapproved,
        ss.supplier_no AS supplier_no
 FROM suppliers c
 INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
 WHERE c.supplier_id != '0'
   AND ss.site_id = '2'
 GROUP BY c.supplier_id
 ORDER BY c.supplier_name ASC
 LIMIT 0, 20

The EXPLAIN output for the query is as follows:

 id  select_type         table  type    possible_keys           key         key_len  ref             rows  Extra
 1   PRIMARY             ss     ref     site_id,supplier_id     site_id     4        const           1287  Using where; Using temporary; Using filesort
 1   PRIMARY             c      eq_ref  PRIMARY                 PRIMARY     4        ss.supplier_id  1
 3   DEPENDENT SUBQUERY  q2     ref     supplier_id,reviewed    reviewed    4        const           263   Using where
 2   DEPENDENT SUBQUERY  q1     ref     supplier_id,incomplete  incomplete  4        const           254   Using where

The COUNT subqueries are there because I need the number of matching rows from those tables. This cannot be done in a separate query, since the results also need to be sortable by these values :(

+6
4 answers

As a shot in the dark, does this run any faster? (I don't have MySQL to hand to check the syntax, so forgive any small errors, but you should get the idea.)

 SELECT c.supplier_id, supplier_name, address1, address2, address3, address4,
        suppliertype, postcode, contact_name,
        questions, reviewed,
        IF(active=1, 'Yes', IF(active=2, 'NCR Only', 'Inactive')) AS rated,
        questapproved,
        ss.supplier_no AS supplier_no
 FROM suppliers c
 INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
 INNER JOIN (SELECT supplier_id,
                    SUM(IF(incomplete = '0', 1, 0)) AS questions,
                    SUM(IF(reviewed = '1', 1, 0)) AS reviewed
             FROM supplier_questions q1
             GROUP BY supplier_id) AS tmp ON c.supplier_id = tmp.supplier_id
 WHERE c.supplier_id != '0'
   AND ss.site_id = '2'
 GROUP BY c.supplier_id
 ORDER BY c.supplier_name ASC
 LIMIT 0, 20
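One caveat with joining to a pre-aggregated derived table: an INNER JOIN drops suppliers that have no rows in supplier_questions at all, whereas the correlated subqueries in the question would report 0 for them. A minimal sketch of a variant that keeps those suppliers, assuming the table and column names from the question and assuming that reviewed = '1' is the intended condition for the second count (only a few columns are selected to keep it short):

```sql
-- Sketch only: names come from the question; tmp is the alias used in this answer.
SELECT c.supplier_id,
       c.supplier_name,
       COALESCE(tmp.questions, 0) AS questions,  -- 0 when the supplier has no question rows
       COALESCE(tmp.reviewed, 0)  AS reviewed,
       ss.supplier_no
FROM suppliers c
INNER JOIN supplier_site ss
        ON c.supplier_id = ss.supplier_id
LEFT JOIN (
        -- Aggregate supplier_questions once, instead of once per supplier row
        SELECT supplier_id,
               SUM(IF(incomplete = '0', 1, 0)) AS questions,
               SUM(IF(reviewed = '1', 1, 0))   AS reviewed
        FROM supplier_questions
        GROUP BY supplier_id
     ) AS tmp
       ON c.supplier_id = tmp.supplier_id
WHERE ss.site_id = '2'
ORDER BY c.supplier_name ASC
LIMIT 0, 20;
```

Whether this is actually faster depends on the size of supplier_questions, since the derived table aggregates every supplier, not just the 20 that end up on the page.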
+2

 FROM suppliers c
 INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
 WHERE c.supplier_id != '0'
   AND ss.site_id = '2'
 GROUP BY c.supplier_id
 ORDER BY c.supplier_name ASC

Since auto-generated primary keys are never 0 (unless there is a serious database design error), you can drop the c.supplier_id != '0' condition.

ss.site_id = '2' should go in the JOIN condition for readability.

This looks like it should match only one row in the supplier_site table per supplier (if this is the usual 1-N relationship, i.e. you are selecting the second address of each supplier; maybe '2' means "billing address" or something like that), so GROUP BY c.supplier_id is useless. If GROUP BY is actually doing something, the query is wrong, because the "address" columns, which presumably come from the supplier_site table, will come from a random row.

So here is a simplified FROM (the WHERE is gone):
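Whether the GROUP BY is really redundant can be checked directly. A small sketch, assuming the supplier_site table and the site_id value from the question; it lists any supplier that has more than one row for site 2:

```sql
-- Sketch: suppliers with more than one supplier_site row for site_id = '2'.
-- An empty result suggests GROUP BY c.supplier_id in the original query does nothing.
SELECT ss.supplier_id,
       COUNT(*) AS rows_for_site
FROM supplier_site ss
WHERE ss.site_id = '2'
GROUP BY ss.supplier_id
HAVING COUNT(*) > 1;
```

If this returns rows, the original query is collapsing several site rows into one per supplier, and the non-aggregated columns taken from supplier_site are not well defined.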

 FROM suppliers c INNER JOIN supplier_site ss ON (c.supplier_id = ss.supplier_id AND ss.site_id = '2') ORDER BY c.supplier_name ASC 

I assume you have an index on c.supplier_name, so this part of the query should be very fast.
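If that index turns out to be missing, checking for it and creating it might look like the following sketch. The index name idx_supplier_name is made up for illustration; the table and column names are taken from the question.

```sql
-- List the existing indexes on suppliers to see whether supplier_name is covered.
SHOW INDEX FROM suppliers;

-- Hypothetical index name; create it only if no index on supplier_name exists yet.
CREATE INDEX idx_supplier_name ON suppliers (supplier_name);
```

With such an index in place, MySQL may be able to read suppliers in supplier_name order and stop after the first 20 matches instead of sorting the whole result, though the optimizer is free to pick a different plan.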

Now try this query:

 SELECT a.*,
        COALESCE(SUM(q.incomplete = '0'), 0) AS questions,
        COALESCE(SUM(q.reviewed = '1'), 0) AS reviewed
 FROM ( SELECT c.supplier_id, supplier_name, address1, address2, address3, address4,
               suppliertype, postcode, contact_name,
               IF(active=1, 'Yes', IF(active=2, 'NCR Only', 'Inactive')) AS rated,
               questapproved,
               ss.supplier_no AS supplier_no
        FROM suppliers c
        INNER JOIN supplier_site ss ON (c.supplier_id = ss.supplier_id AND ss.site_id = '2')
        ORDER BY c.supplier_name ASC
        LIMIT 0, 20 ) a
 LEFT JOIN supplier_questions q ON (q.supplier_id = a.supplier_id)
 GROUP BY a.supplier_id
 ORDER BY a.supplier_name;
+2

If you remove the subqueries, you get something like this:

 SELECT c.supplier_id, supplier_name, address1, address2, address3, address4,
        suppliertype, postcode, contact_name,
        COUNT(IF(q1.incomplete = '0', '0', NULL)) AS questions,
        IF(active=1, 'Yes', IF(active=2, 'NCR Only', 'Inactive')) AS rated,
        COUNT(IF(q1.reviewed = '1', '1', NULL)) AS reviewed,
        questapproved,
        ss.supplier_no AS supplier_no
 FROM suppliers c
 INNER JOIN supplier_site ss ON c.supplier_id = ss.supplier_id
 LEFT OUTER JOIN supplier_questions q1 ON c.supplier_id = q1.supplier_id
 WHERE c.supplier_id != '0'
   AND ss.site_id = '2'
 GROUP BY c.supplier_id
 ORDER BY c.supplier_name ASC
 LIMIT 0, 20

I do not have a MySQL database to hand, so there may be errors in my SQL. The idea is to remove the subqueries, replace them with an outer join, and use IF so that only the relevant rows are counted.
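The counting trick relies on COUNT ignoring NULL values. A tiny self-contained illustration, using three made-up rows in an inline derived table (the alias demo and the sample values are not from the question):

```sql
-- Three sample rows: two with incomplete = '0', one with incomplete = '1'.
SELECT COUNT(*)                                 AS all_rows,
       COUNT(IF(incomplete = '0', 1, NULL))     AS via_count,  -- NULLs are not counted
       SUM(incomplete = '0')                    AS via_sum     -- a comparison yields 1 or 0 in MySQL
FROM (
        SELECT '0' AS incomplete
        UNION ALL SELECT '1'
        UNION ALL SELECT '0'
     ) AS demo;
-- all_rows = 3, via_count = 2, via_sum = 2
```

Both forms give the same count here. They differ when there are no rows to aggregate at all: COUNT then returns 0, while SUM returns NULL.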

+1

I would first try restructuring the query by pre-querying the aggregates per supplier (the question count and the reviewed count) ONCE, and then joining to the rest of the details. With the STRAIGHT_JOIN keyword, the tables should be processed in the order listed. This pre-aggregates first and uses THAT as the basis for joining to suppliers and then to supplier_site. No outer GROUP BY is required, because the pre-aggregate is already grouped by supplier ID. However, the join to supplier_site (your ss.supplier_no) implies that a supplier has more than one location. Does this mean that the address and active status columns come from that table?

Should the questions be associated with a specific supplier AND the corresponding site location, or not?

Also, since the pre-query has the WHERE clause supplier_id != '0', that condition is not needed downstream: the pre-query is the basis of a normal join to the other tables, which already excludes those rows from the result set.

 SELECT STRAIGHT_JOIN
        PreAggregate.supplier_id, PreAggregate.supplier_name,
        address1, address2, address3, address4, suppliertype, postcode, contact_name,
        PreAggregate.Questions,
        IF(active=1, 'Yes', IF(active=2, 'NCR Only', 'Inactive')) AS rated,
        PreAggregate.Reviewed,
        questapproved,
        ss.supplier_no AS supplier_no
 FROM ( SELECT s1.Supplier_ID, s1.Supplier_Name,
               SUM( IF( q1.Incomplete = '0', 1, 0 )) Questions,
               SUM( IF( q1.Reviewed = '1', 1, 0 )) Reviewed
        FROM suppliers s1
        JOIN supplier_questions q1 ON s1.supplier_id = q1.supplier_id
        WHERE s1.supplier_id != '0'
        GROUP BY s1.Supplier_ID
        ORDER BY s1.supplier_name ASC ) PreAggregate
 JOIN suppliers c ON PreAggregate.Supplier_ID = c.Supplier_ID
 JOIN supplier_site ss ON PreAggregate.Supplier_ID = ss.supplier_id AND ss.Site_ID = '2'
 LIMIT 0, 20
0

Source: https://habr.com/ru/post/888895/
