Can you index subqueries?

I have a table and a query that looks like the one below. For a working example, see this SQL Fiddle.

    SELECT o.property_B, SUM(o.score1), w.score
    FROM o
    INNER JOIN (
        SELECT o.property_B, SUM(o.score2) AS score
        FROM o
        GROUP BY property_B
    ) w ON w.property_B = o.property_B
    WHERE o.property_A = 'specific_A'
    GROUP BY property_B;

With my real data, this query takes 27 seconds. However, if I first create w as a temporary table and index it on property_B, everything together takes about 1 second.

    CREATE TEMPORARY TABLE w AS
        SELECT o.property_B, SUM(o.score2) AS score
        FROM o
        GROUP BY property_B;

    ALTER TABLE w ADD INDEX `property_B_idx` (property_B);

    SELECT o.property_B, SUM(o.score1), w.score
    FROM o
    INNER JOIN w ON w.property_B = o.property_B
    WHERE o.property_A = 'specific_A'
    GROUP BY property_B;

    DROP TABLE IF EXISTS w;

Is there a way to combine the best of these two approaches? That is, a single query that gets the speed benefit of an index on the subquery?

EDIT

After reading Mehran's answer below, I found this explanation in the MySQL documentation:

Starting with MySQL 5.6.3, the optimizer more efficiently processes subqueries in the FROM clause (i.e., derived tables):

...

In cases where a subquery in the FROM clause requires materialization, the optimizer can speed up access to the result by adding an index to the materialized table. If such an index permits ref access to the table, it can significantly reduce the amount of data that must be read during query execution. Consider the following query:

 SELECT * FROM t1 JOIN (SELECT * FROM t2) AS derived_t2 ON t1.f1=derived_t2.f1; 

The optimizer builds an index over column f1 of derived_t2 if doing so enables ref access for the lowest-cost execution plan. After adding the index, the optimizer can treat the materialized derived table the same way as a regular table with an index, and it benefits from the generated index in the same way. The overhead of creating the index is negligible compared to the cost of executing the query without it. If ref access would result in a higher cost than some other access method, no index is created and the optimizer loses nothing.

+5
4 answers

First of all, you should know that creating a temporary table is a perfectly valid solution, but only when no other option is applicable, which is not the case here!

In your case, you can easily improve your query, as FrankPl pointed out, because your subquery and main query both group by the same field. Therefore, you do not need a subquery at all. I am copying FrankPl's solution here for completeness:

 SELECT o.property_B, SUM(o.score1), SUM(o.score2) FROM o GROUP BY property_B; 

However, this does not mean you will never run into a scenario in which you want to index a subquery. In that case you have two options. The first is to use a temporary table holding the results of the subquery, as you suggested yourself. This approach has the advantage that MySQL has supported it for a long time. It is just not practical when a huge amount of data is involved.

The second option is to use MySQL 5.6 or later. Recent versions of MySQL include new algorithms, so that an index defined on a table used inside a subquery can also be used outside the subquery.
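One way to check whether the optimizer actually builds an index on the derived table is to run EXPLAIN on the original single-query form. The sketch below reuses the table and column names from the question and assumes MySQL 5.6.3 or later. When an index is generated, the row for the derived table typically reports a key named along the lines of <auto_key0>, though the plan you get depends on your data and server version.

```sql
-- Sketch: inspect the execution plan of the original query.
-- Look at the row for the derived table (w): a "ref" access type with a
-- generated key suggests the optimizer indexed the materialized result.
EXPLAIN
SELECT o.property_B, SUM(o.score1), w.score
FROM o
INNER JOIN (
    SELECT property_B, SUM(score2) AS score
    FROM o
    GROUP BY property_B
) AS w ON w.property_B = o.property_B
WHERE o.property_A = 'specific_A'
GROUP BY o.property_B;
```

If no generated key shows up, the optimizer judged another access method cheaper, and the temporary-table approach remains available as a fallback.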

[UPDATE]

For the edited version of the question, I would recommend the following solution:

    SELECT o.property_B,
           SUM(IF(o.property_A = 'specific_A', o.score1, 0)),
           SUM(o.score2)
    FROM o
    GROUP BY property_B
    HAVING SUM(IF(o.property_A = 'specific_A', o.score1, 0)) > 0;

But you will need to work on the HAVING part. You may need to change it to fit your actual problem.
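The HAVING clause matters because the original join with WHERE o.property_A = 'specific_A' returns every property_B group that contains at least one matching row, whereas a condition of the form SUM(...) > 0 also drops groups whose matching score1 values add up to zero or less. A hedged variant that filters on the presence of a matching row instead, assuming that is the intended behavior (the aliases score1_sum and score2_sum are illustrative, not from the question):

```sql
-- Sketch: conditional aggregation without a subquery.
SELECT o.property_B,
       SUM(CASE WHEN o.property_A = 'specific_A' THEN o.score1 ELSE 0 END) AS score1_sum,
       SUM(o.score2) AS score2_sum
FROM o
GROUP BY o.property_B
-- Keep a group only if it has at least one row with property_A = 'specific_A'.
HAVING SUM(CASE WHEN o.property_A = 'specific_A' THEN 1 ELSE 0 END) > 0;
```

This reads the table once, so there is no derived table left to index.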

+2

It should be up to MySQL to optimize your query, and I don't think there is a way to create an index on the fly. However, you can try to force the use of the property_o index (if you have one). See http://dev.mysql.com/doc/refman/5.1/en/index-hints.html

In addition, you can combine the CREATE and ALTER statements if you wish.
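As a sketch of what combining the two statements could look like, MySQL's CREATE TABLE ... SELECT accepts index definitions in parentheses before the SELECT. The table, column, and index names below are the ones from the question:

```sql
-- Sketch: create the temporary table and its index in a single statement,
-- replacing the separate ALTER TABLE ... ADD INDEX step.
CREATE TEMPORARY TABLE w (
    INDEX property_B_idx (property_B)
) AS
SELECT property_B, SUM(score2) AS score
FROM o
GROUP BY property_B;
```

The join against w and the final DROP TABLE then stay as in the question.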

+1

I do not understand why you need a join at all. I would assume that

 SELECT o.property_B, SUM(o.score1), SUM(o.score2) FROM o GROUP BY property_B; 

should give what you want, but with a much simpler and therefore better optimized statement.

+1

I am not very familiar with MySQL; I have mostly worked with Oracle. If you need a WHERE condition inside a SUM, you can use DECODE or CASE. In Oracle syntax it would look something like this:

 SELECT o.property_B, SUM(DECODE(property_A, 'specific_A', o.score1, 0)), SUM(o.score2) FROM o GROUP BY property_B; 

or with CASE:

 SELECT o.property_B, SUM(CASE WHEN property_A = 'specific_A' THEN o.score1 ELSE 0 END), SUM(o.score2) FROM o GROUP BY property_B; 
+1

Source: https://habr.com/ru/post/1207310/

