SQL distance query optimization

I am running a MySQL query that returns results based on location. However, I recently noticed that it really slows down my PHP application. I am using CodeIgniter, and the profiler shows the query taking 4.2 seconds. The geoname table has 500,000 rows. I have indexes on some key columns; how else can I speed up this query?

Here is my SQL:

 SELECT `products`.`product_name`, `geoname`.`geonameid`, `geoname`.`latitude`, `geoname`.`longitude`,
        `products`.`product_id`,
        AVG(ratings.vote) AS rating,
        COUNT(comments.comment_id) AS total_comments,
        (6371 * acos(cos(radians(38.7666667)) * cos(radians(geoname.latitude))
          * cos(radians(geoname.longitude) - radians(-3.3833333))
          + sin(radians(38.7666667)) * sin(radians(geoname.latitude)))) AS distance
 FROM (`products`)
 JOIN `geoname` ON `geoname`.`geonameid` = `products`.`geoname_id`
 LEFT JOIN `ratings` ON `ratings`.`var_id` = `products`.`product_id`
 LEFT JOIN `comments` ON `comments`.`var_id` = `products`.`product_id`
 WHERE `products`.`product_id` != 82
 GROUP BY `products`.`product_id`
 HAVING `distance` < 99
 ORDER BY `distance`
 LIMIT 10
4 answers

Let's start with the query itself: cos(radians(geoname.latitude)) and the other per-row functions are invariant for a given row, so we can do a little preprocessing and store the calculated values in the table. (Trigonometric functions are expensive to compute, mainly because they are evaluated with a series expansion.)
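As a rough sketch of this preprocessing idea (in Python rather than SQL, so it can be run on its own): the per-row values `sin_lat`, `cos_lat` and `lon_rad` are hypothetical names for what would become extra columns on geoname, and the per-query constants are computed once. This is an illustration under those assumptions, not the asker's schema.

```python
import math

EARTH_RADIUS_KM = 6371.0


def precompute_row(lat_deg, lon_deg):
    """Values computed once per geoname row (hypothetical extra columns)."""
    lat = math.radians(lat_deg)
    return {
        "sin_lat": math.sin(lat),
        "cos_lat": math.cos(lat),
        "lon_rad": math.radians(lon_deg),
    }


def distance_precomputed(row, sin_lat0, cos_lat0, lon0_rad):
    """Spherical law of cosines using stored row values and query constants."""
    c = (cos_lat0 * row["cos_lat"] * math.cos(row["lon_rad"] - lon0_rad)
         + sin_lat0 * row["sin_lat"])
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    return EARTH_RADIUS_KM * math.acos(max(-1.0, min(1.0, c)))


def distance_direct(lat0, lon0, lat, lon):
    """The formula as written in the question, with every call made per row."""
    c = (math.cos(math.radians(lat0)) * math.cos(math.radians(lat))
         * math.cos(math.radians(lon) - math.radians(lon0))
         + math.sin(math.radians(lat0)) * math.sin(math.radians(lat)))
    return EARTH_RADIUS_KM * math.acos(max(-1.0, min(1.0, c)))


# Per-query constants for the search point used in the question.
LAT0, LON0 = 38.7666667, -3.3833333
SIN_LAT0 = math.sin(math.radians(LAT0))
COS_LAT0 = math.cos(math.radians(LAT0))
LON0_RAD = math.radians(LON0)

row = precompute_row(40.4168, -3.7038)  # an arbitrary sample location
print(distance_precomputed(row, SIN_LAT0, COS_LAT0, LON0_RAD))
```

With the stored values, only one cos() and one acos() remain per row.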

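Taken on its own, 6371 * acos(cos(radians(38.7666667))) is equal to radians(38.7666667) * 6371, so there would be no reason to compute it. In the full formula, however, acos wraps the whole sum, so what can actually be computed once per query are the constant terms cos(radians(38.7666667)), sin(radians(38.7666667)) and radians(-3.3833333).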

Secondly, if you do not need full accuracy, you can tabulate the trigonometric values for, say, 10,000 points from 0 to pi/2. That should give a good approximation, to about 4 decimal places, which means an error of less than a kilometer in:

 (6371 * acos(cos(radians(38.7666667)) * cos(radians(geoname.latitude)) * cos(radians(geoname.longitude) - radians(-3.3833333)) + sin(radians(38.7666667)) * sin(radians(geoname.latitude)))) 

Also remember that sin(a) for pi/2 < a < pi is equal to sin(pi - a); for pi < a < 3/2 pi it is equal to -sin(a - pi); and for 3/2 pi < a < 2 pi it is equal to -sin(2 pi - a). Similar reductions can be applied to the cos function.
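A minimal sketch of the lookup-table idea combined with these symmetry rules, in Python for illustration. The names `SIN_TABLE`, `table_sin` and `table_cos` are made up for this example, and nearest-point lookup (no interpolation) is an assumption.

```python
import math

N = 10_000                      # table points between 0 and pi/2
STEP = (math.pi / 2) / N
SIN_TABLE = [math.sin(i * STEP) for i in range(N + 1)]


def table_sin(a):
    """Approximate sin(a) from the first-quadrant table plus symmetry."""
    a = a % (2 * math.pi)       # bring the angle into [0, 2*pi]
    if a <= math.pi / 2:
        sign, x = 1, a
    elif a <= math.pi:
        sign, x = 1, math.pi - a        # sin(a) = sin(pi - a)
    elif a <= 3 * math.pi / 2:
        sign, x = -1, a - math.pi       # sin(a) = -sin(a - pi)
    else:
        sign, x = -1, 2 * math.pi - a   # sin(a) = -sin(2*pi - a)
    return sign * SIN_TABLE[round(x / STEP)]


def table_cos(a):
    """cos(a) = sin(a + pi/2), so the same table serves both functions."""
    return table_sin(a + math.pi / 2)


print(table_sin(math.radians(38.7666667)), math.sin(math.radians(38.7666667)))
```

With nearest-point lookup the worst-case error is about half a step (roughly 8e-5), which is consistent with the "about 4 decimal places" estimate. Whether a table lookup is actually faster than the database's built-in functions would need measuring.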

Try it and see if it helps. Luke


If you run EXPLAIN on the query in MySQL, I think you will find that calculating the distance makes your indexes useless. You are forcing the query engine to do a full table scan.

The only way to save the situation is to place the distance in a separate column and index it.
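The effect can be illustrated with a small standalone sketch. It uses SQLite through Python's sqlite3 module instead of MySQL, purely so that it runs without a server; the `distance` column, the index names and the idea that the stored distance refers to one fixed reference point are all assumptions made for the example. MySQL's EXPLAIN output looks different, but the contrast between a computed predicate and an indexed column is analogous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE geoname ("
    "geonameid INTEGER PRIMARY KEY, latitude REAL, longitude REAL, "
    "distance REAL)"  # hypothetical precomputed distance to a fixed point
)
conn.execute("CREATE INDEX idx_latitude ON geoname (latitude)")
conn.execute("CREATE INDEX idx_distance ON geoname (distance)")


def plan(sql):
    """Return the human-readable steps of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Predicate computed from a column: the index cannot narrow the search.
computed = plan(
    "SELECT geonameid FROM geoname WHERE abs(latitude - 38.7666667) < 1"
)
# Predicate on a stored, indexed column: the index can be range-searched.
stored = plan("SELECT geonameid FROM geoname WHERE distance < 99")

print(computed)
print(stored)
```

The first plan should report a scan over all rows, the second a search that uses the index.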


If you can approximate any search location by one of, say, 1,000 to 10,000 points in space, you could actually store the distances in an auxiliary table, along these lines:

 create table distance (
   position1_id int,
   position2_id int,
   distance int -- probably precise enough
 )

with an index on (position1_id, distance). The table would have anywhere from 10^6 to 10^8 rows, but using the index I think you could quickly get the closest position2_id values. Even if this is not precise enough for you (because you have to settle on a limited resolution), it would let you quickly eliminate possibly more than 99% of the places that are irrelevant in a particular case.
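A small runnable sketch of that lookup, again using SQLite via Python's sqlite3 only so it works standalone. The index name `idx_distance_lookup`, the helper `nearest` and the sample rows are invented for illustration, and the DDL would need adapting for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE distance ("
    "position1_id INTEGER, position2_id INTEGER, distance INTEGER)"
)
# Composite index: equality on position1_id, then ordered by distance.
conn.execute(
    "CREATE INDEX idx_distance_lookup ON distance (position1_id, distance)"
)
# Made-up sample distances in kilometers between grid positions.
conn.executemany(
    "INSERT INTO distance VALUES (?, ?, ?)",
    [(1, 2, 120), (1, 3, 45), (1, 4, 98), (2, 3, 80), (2, 4, 15)],
)


def nearest(conn, position_id, max_km, limit=10):
    """Closest positions to position_id within max_km, nearest first."""
    return conn.execute(
        "SELECT position2_id, distance FROM distance "
        "WHERE position1_id = ? AND distance < ? "
        "ORDER BY distance LIMIT ?",
        (position_id, max_km, limit),
    ).fetchall()


print(nearest(conn, 1, 99))
```

Because the index already orders rows by distance within each position1_id, the ORDER BY and LIMIT can be answered by reading the first few index entries.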


You can eliminate the radians() function by simply dividing by 57.29577951 instead. That removes six function calls per row. The formula is generally not friendly to relational queries over large sets. Still, here is another query that tries to shrink the derived tables before joining them. I'm not sure whether it will run faster or slower without testing and tuning. Ultimately, I would build a statistics table keyed on the primary key and set up triggers on the other tables to maintain it, so that your final distance query runs almost instantly against a very small table. And to be truly thorough, I would build an audit table similar to the statistics table to summarize trends.

 select p.product_name, g.geonameid, g.latitude, g.longitude, p.product_id,
        r.rating, c.total_comments, g.distance
 from (select product_id, geoname_id, product_name
       from products
       where product_id != 82) p
 inner join (select geonameid, latitude, longitude,
                    (6371 * acos(cos(38.7666667/57.29577951) * cos(latitude/57.29577951)
                      * cos((longitude/57.29577951) - (-3.3833333/57.29577951))
                      + sin(38.7666667/57.29577951) * sin(latitude/57.29577951))) as distance
             from geoname
             group by geonameid
             having distance < 99) g on p.geoname_id = g.geonameid
 -- each derived table below yields one row per product, so no outer group by is needed
 left join (select var_id, avg(vote) as rating from ratings group by var_id) r
        on p.product_id = r.var_id
 left join (select var_id, count(comment_id) as total_comments from comments group by var_id) c
        on p.product_id = c.var_id
 order by g.distance
 limit 10
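The constant 57.29577951 is 180/pi rounded to eight decimal places. A quick Python check (illustrative only; the helper name `to_radians` is made up) shows how little accuracy the substitution costs:

```python
import math

DEG_PER_RAD = 57.29577951  # 180 / pi, rounded as in the query above


def to_radians(deg):
    """Convert degrees to radians by division instead of calling radians()."""
    return deg / DEG_PER_RAD


lat = 38.7666667
error = abs(to_radians(lat) - math.radians(lat))
print(error)            # difference in radians
print(error * 6371.0)   # the same difference scaled by the Earth radius, in km
```

The difference is far below the resolution that matters for a 99 km radius search.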

Source: https://habr.com/ru/post/1304055/

