Using ORDER BY NULL in a UNION

I have a query (see below) that uses a custom UDF to calculate whether certain points lie inside a polygon (first query in the UNION) or a circle (second query in the UNION).

    select e.inquiry_match_type_id
         , a.geo_boundary_id
         , GeoBoundaryContains(c.tpi_geo_boundary_coverage_type_id, 29.287437, -95.055807, a.lat, a.lon, a.geo_boundary_vertex_id) in_out
         , e.inquiry_id
         , e.external_id
         , COALESCE(f.inquiry_device_id,0) inquiry_device_id
         , b.external_info1
         , b.external_info2
         , b.geo_boundary_id
         , b.geo_boundary_type_id
    from geo_boundary_vertex a
    join geo_boundary b on b.geo_boundary_id = a.geo_boundary_id
    join trackpoint_index_geo_boundary_mem c on c.geo_boundary_id = b.geo_boundary_id
    join trackpoint_index_mem d on d.trackpoint_index_id = c.trackpoint_index_id
    join inquiry_mem e on e.inquiry_id = b.inquiry_id
    left outer join inquiry_device_mem f on f.inquiry_id = e.inquiry_id and f.device_id = 3201
    where d.trackpoint_index_id = 3127
      and b.geo_boundary_type_id = 3
      and e.expiration_date >= now()
    group by a.geo_boundary_id
    UNION
    select e.inquiry_match_type_id
         , b.geo_boundary_id
         , GeoBoundaryContains(c.tpi_geo_boundary_coverage_type_id, 29.287437, -95.055807, b.centroid_lat, b.centoid_lon, b.radius) in_out
         , e.inquiry_id
         , e.external_id
         , COALESCE(f.inquiry_device_id,0) inquiry_device_id
         , b.external_info1
         , b.external_info2
         , b.geo_boundary_id
         , b.geo_boundary_type_id
    from geo_boundary b
    join trackpoint_index_geo_boundary_mem c on c.geo_boundary_id = b.geo_boundary_id
    join trackpoint_index_mem d on d.trackpoint_index_id = c.trackpoint_index_id
    join inquiry_mem e on e.inquiry_id = b.inquiry_id
    left outer join inquiry_device_mem f on f.inquiry_id = e.inquiry_id and f.device_id = 3201
    where d.trackpoint_index_id = 3127
      and b.geo_boundary_type_id = 2
      and e.expiration_date >= now()
    group by b.geo_boundary_id

When I run EXPLAIN on the query, I get the following:

    id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
    1 | PRIMARY | d | const | PRIMARY | PRIMARY | 4 | const | 1 | Using temporary; Using filesort
    1 | PRIMARY | c | ref | PRIMARY,fk_mtp_idx_geo_boundary_mtp_idx,fk_mtp_idx_geo_boundary_geo_boundary,fk_mtp_idx_geo_boundary_mtp_mem_idx,fk_mtp_idx_geo_boundary_geo_boundary_mem | fk_mtp_idx_geo_boundary_mtp_idx | 4 | const | 9 |
    1 | PRIMARY | b | eq_ref | PRIMARY,fk_geo_boundary_inquiry,fk_geo_boundary_geo_boundary_type | PRIMARY | 4 | gothim.c.geo_boundary_id | 1 | Using where
    1 | PRIMARY | e | eq_ref | PRIMARY | PRIMARY | 4 | gothim.b.inquiry_id | 1 | Using where
    1 | PRIMARY | f | ref | fk_inquiry_device_mem_inquiry | fk_inquiry_device_mem_inquiry | 4 | gothim.e.inquiry_id | 2 |
    1 | PRIMARY | a | ref | fk_geo_boundary_vertex_geo_boundary | fk_geo_boundary_vertex_geo_boundary | 4 | gothim.b.geo_boundary_id | 11 | Using where
    2 | UNION | d | const | PRIMARY | PRIMARY | 4 | const | 1 | Using temporary; Using filesort
    2 | UNION | c | ref | PRIMARY,fk_mtp_idx_geo_boundary_mtp_idx,fk_mtp_idx_geo_boundary_geo_boundary,fk_mtp_idx_geo_boundary_mtp_mem_idx,fk_mtp_idx_geo_boundary_geo_boundary_mem | fk_mtp_idx_geo_boundary_mtp_idx | 4 | const | 9 |
    2 | UNION | b | eq_ref | PRIMARY,fk_geo_boundary_inquiry,fk_geo_boundary_geo_boundary_type | PRIMARY | 4 | gothim.c.geo_boundary_id | 1 | Using where
    2 | UNION | e | eq_ref | PRIMARY | PRIMARY | 4 | gothim.b.inquiry_id | 1 | Using where
    2 | UNION | f | ref | fk_inquiry_device_mem_inquiry | fk_inquiry_device_mem_inquiry | 4 | gothim.e.inquiry_id | 2 |
    (null) | UNION RESULT | <union1,2> | ALL | (null) | (null) | (null) | (null) | (null) | Using filesort

    12 record(s) selected [Fetch MetaData: 1ms] [Fetch Data: 5ms]

Now, I can split the queries and use the ORDER BY NULL trick to get rid of the filesort; however, when I try to add it to the end of the UNION, it does not work.

I am considering either splitting the query into two queries or rewriting it completely so that it does not use UNION (although that is, of course, a bit more complicated). Another thing working against me is that this is in production and I would like to limit the changes. I would really like to just add ORDER BY NULL to the end of the query and be done with it, but that does not work with UNION.

Any help would be greatly appreciated.

+4
3 answers

Typically, ORDER BY can be used for individual queries inside a UNION as follows:

    (
      SELECT * FROM table1, …
      GROUP BY id
      ORDER BY NULL
    )
    UNION ALL
    (
      SELECT * FROM table2, …
      GROUP BY id
      ORDER BY NULL
    )

However, as the docs say:

However, use of ORDER BY for individual SELECT statements implies nothing about the order in which the rows appear in the final result, because UNION by default produces an unordered set of rows. Therefore, the use of ORDER BY in this context is typically in conjunction with LIMIT, so that it is used to determine the subset of the selected rows to retrieve for the SELECT, even though it does not necessarily affect the order of those rows in the final UNION result. If ORDER BY appears without LIMIT in a SELECT, it is optimized away because it will have no effect anyway.

This is, of course, a smart move, but not smart enough: they forgot to optimize away the implicit ordering done by GROUP BY as well. Since the ORDER BY NULL is removed before it can suppress that ordering, the filesort stays.

So now you have to add a very high LIMIT to your individual queries:

    (
      SELECT * FROM table1, …
      GROUP BY id
      ORDER BY NULL
      LIMIT 100000000
    )
    UNION ALL
    (
      SELECT * FROM table2, …
      GROUP BY id
      ORDER BY NULL
      LIMIT 100000000
    )

I will report this as a bug in MySQL and hope they fix it in the next version, but for now you can use this workaround.

Note that a similar trick (using TOP 100%) was used to force ordering in subqueries in SQL Server 2000, but it stopped working in 2005 (the optimizer ignores ORDER BY in subqueries with TOP 100%).

The workaround is safe to use: even if the optimizer behavior changes in a future release, it will not break your queries, it will just make them as slow as they are now.
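To check whether the workaround takes effect on your server, you can put EXPLAIN in front of the whole parenthesized UNION and look at the Extra column of each branch. The following is only a minimal sketch: the toy tables demo_t1 and demo_t2 and their columns are hypothetical stand-ins for the tables above, and the plan you get depends on your MySQL version and indexes.

```sql
-- Hypothetical toy tables, used only to make the statement runnable.
CREATE TABLE demo_t1 (id INT NOT NULL, val INT NOT NULL);
CREATE TABLE demo_t2 (id INT NOT NULL, val INT NOT NULL);

INSERT INTO demo_t1 (id, val) VALUES (1, 10), (1, 20), (2, 30);
INSERT INTO demo_t2 (id, val) VALUES (3, 40), (3, 50);

-- ORDER BY NULL is paired with a very high LIMIT in each branch,
-- so that the ORDER BY is not dropped from the individual SELECT.
EXPLAIN
(
  SELECT id, MAX(val) AS val
  FROM demo_t1
  GROUP BY id
  ORDER BY NULL
  LIMIT 100000000
)
UNION ALL
(
  SELECT id, MAX(val) AS val
  FROM demo_t2
  GROUP BY id
  ORDER BY NULL
  LIMIT 100000000
);
```

Running the same EXPLAIN with the two LIMIT clauses removed gives you the plan to compare against.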

+6

Maybe try something like

    SELECT *
    FROM ( [your entire query here] ) DerivedTable
    ORDER BY NULL

I have never used MySQL, so forgive me if I have lost the plot :)

EDIT: What if you run each individual query separately (which, as you say, works), but insert the results into a temporary table? Then at the end just select from the temp table.
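A rough sketch of the temporary-table idea, assuming MySQL syntax. All names here (part1, part2, tmp_result, id, val) are hypothetical placeholders and not taken from the question; the two INSERT ... SELECT statements stand in for the two halves of the original UNION.

```sql
-- Hypothetical source tables standing in for the two halves of the UNION.
CREATE TABLE part1 (id INT NOT NULL, val INT NOT NULL);
CREATE TABLE part2 (id INT NOT NULL, val INT NOT NULL);

INSERT INTO part1 (id, val) VALUES (1, 10), (1, 20), (2, 30);
INSERT INTO part2 (id, val) VALUES (3, 40), (3, 50);

-- Temporary table that collects the rows of both queries.
CREATE TEMPORARY TABLE tmp_result (id INT NOT NULL, val INT NOT NULL);

-- Each query runs on its own, so ORDER BY NULL can be applied to it directly.
INSERT INTO tmp_result (id, val)
SELECT id, MAX(val) FROM part1 GROUP BY id ORDER BY NULL;

INSERT INTO tmp_result (id, val)
SELECT id, MAX(val) FROM part2 GROUP BY id ORDER BY NULL;

-- Read the combined rows back, then clean up.
SELECT id, val FROM tmp_result;

DROP TEMPORARY TABLE tmp_result;
```

Note that this behaves like UNION ALL: nothing removes duplicate rows unless the final SELECT does so itself.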

0

Have you tried changing UNION to UNION ALL ?

A UNION tries to remove duplicate rows. To do this, it needs to sort the intermediate results, which may explain what you see in your execution plan.

From MySQL Union

By default, MySQL UNION removes all duplicate rows from the result set, even if you do not use DISTINCT explicitly after the UNION keyword.

If you use UNION ALL explicitly, the duplicate rows, if any, remain in the result set. You only use this if you want to keep duplicate rows, or you are sure that there are no duplicate rows in the result set.
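A minimal illustration of the difference, using constant SELECTs so that no tables are needed (the column alias n is just a placeholder):

```sql
-- UNION removes the duplicate, so this returns a single row.
SELECT 1 AS n
UNION
SELECT 1;

-- UNION ALL keeps every row, so this returns two rows.
SELECT 1 AS n
UNION ALL
SELECT 1;
```

In the query from the question, the two halves filter on different geo_boundary_type_id values (3 and 2) and select that column, so a row from one half should never be identical to a row from the other half. If that holds for your data, switching to UNION ALL should not change the rows returned.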

Edit

I doubt it will make any difference (it might even be worse), but you can try the following "equivalent" query:

    select *
    from (
      select b.geo_boundary_id
           , GeoBoundaryContains(c.tpi_geo_boundary_coverage_type_id, 29.287437, -95.055807, b.centroid_lat, b.centoid_lon, b.radius) in_out
      from geo_boundary b
      join trackpoint_index_geo_boundary_mem c on c.geo_boundary_id = b.geo_boundary_id
      where b.geo_boundary_type_id = 2
      group by b.geo_boundary_id
      union all
      select a.geo_boundary_id
           , GeoBoundaryContains(c.tpi_geo_boundary_coverage_type_id, 29.287437, -95.055807, a.lat, a.lon, a.geo_boundary_vertex_id) in_out
      from geo_boundary_vertex a
      join geo_boundary b on b.geo_boundary_id = a.geo_boundary_id
      join trackpoint_index_geo_boundary_mem c on c.geo_boundary_id = b.geo_boundary_id
      where b.geo_boundary_type_id = 3
      group by a.geo_boundary_id
    ) s
    inner join (
      select e.inquiry_match_type_id
           , e.inquiry_id
           , e.external_id
           , COALESCE(f.inquiry_device_id,0) inquiry_device_id
           , b.external_info1
           , b.external_info2
           , b.geo_boundary_id
           , b.geo_boundary_type_id
      from geo_boundary b
      join trackpoint_index_geo_boundary_mem c on c.geo_boundary_id = b.geo_boundary_id
      join trackpoint_index_mem d on d.trackpoint_index_id = c.trackpoint_index_id
      join inquiry_mem e on e.inquiry_id = b.inquiry_id
      left outer join inquiry_device_mem f on f.inquiry_id = e.inquiry_id and f.device_id = 3201
      where d.trackpoint_index_id = 3127
        and b.geo_boundary_type_id IN (2, 3)
        and e.expiration_date >= now()
    ) r on r.geo_boundary_id = s.geo_boundary_id

0

Source: https://habr.com/ru/post/1345883/

