Using a WHERE clause to find POIs within a given distance of a latitude and longitude

I use the following SQL to find "ALL" POIs closest to the given coordinates, but I want to find one specific kind of POI, not all of them. When I try to add a WHERE clause, I get an error and the query doesn't work. I am stuck, since I use a single table for the coordinates of all POIs.

    SET @orig_lat=55.4058;
    SET @orig_lon=13.7907;
    SET @dist=10;
    SELECT *,
        3956 * 2 * ASIN(SQRT(
            POWER(SIN((@orig_lat - abs(latitude)) * pi()/180 / 2), 2)
            + COS(@orig_lat * pi()/180) * COS(abs(latitude) * pi()/180)
            * POWER(SIN((@orig_lon - longitude) * pi()/180 / 2), 2)
        )) AS distance
    FROM geo_kulplex.sweden_bobo
    HAVING distance < @dist
    ORDER BY distance
    LIMIT 10;
2 answers

The problem is that you cannot reference a column alias (distance in this case) elsewhere in the same select list or in the where clause. For example, you cannot do this:

 select a, b, a + b as NewCol, NewCol + 1 as AnotherCol from table where NewCol = 2 

This fails in both places: in the select list when processing NewCol + 1, and in the where clause when processing NewCol = 2.

There are two ways to solve this problem:

1) Replace the alias reference with the calculation itself. Example:

 select a, b, a + b as NewCol, a + b + 1 as AnotherCol from table where a + b = 2 

2) Wrap the query in an outer select statement:

 select a, b, NewCol, NewCol + 1 as AnotherCol from ( select a, b, a + b as NewCol from table ) as S where NewCol = 2 

Now, given that your calculated column is HUGE and not very human-friendly :) I think you should go with the second option for readability:

    SET @orig_lat=55.4058;
    SET @orig_lon=13.7907;
    SET @dist=10;
    SELECT *
    FROM (
        SELECT *,
            3956 * 2 * ASIN(SQRT(
                POWER(SIN((@orig_lat - abs(latitude)) * pi()/180 / 2), 2)
                + COS(@orig_lat * pi()/180) * COS(abs(latitude) * pi()/180)
                * POWER(SIN((@orig_lon - longitude) * pi()/180 / 2), 2)
            )) AS distance
        FROM geo_kulplex.sweden_bobo
    ) AS S
    WHERE distance < @dist
    ORDER BY distance
    LIMIT 10;

Edit: As pointed out in the other answer, this results in a full table scan. Depending on how much data you process, you may want to avoid that and use the first option instead, which should run faster.
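For comparison, here is a sketch of what the first option might look like for this query: the full distance expression is repeated in the WHERE clause instead of referencing the alias. It reuses the table, columns, and variables from the question and has not been benchmarked against the subquery version.

```sql
SET @orig_lat = 55.4058;
SET @orig_lon = 13.7907;
SET @dist = 10;

SELECT *,
       3956 * 2 * ASIN(SQRT(
           POWER(SIN((@orig_lat - abs(latitude)) * pi()/180 / 2), 2)
           + COS(@orig_lat * pi()/180) * COS(abs(latitude) * pi()/180)
           * POWER(SIN((@orig_lon - longitude) * pi()/180 / 2), 2)
       )) AS distance
FROM geo_kulplex.sweden_bobo
-- The alias is not visible in WHERE, so the expression is written out again.
WHERE 3956 * 2 * ASIN(SQRT(
           POWER(SIN((@orig_lat - abs(latitude)) * pi()/180 / 2), 2)
           + COS(@orig_lat * pi()/180) * COS(abs(latitude) * pi()/180)
           * POWER(SIN((@orig_lon - longitude) * pi()/180 / 2), 2)
       )) < @dist
-- ORDER BY may reference the alias.
ORDER BY distance
LIMIT 10;
```

The price is the duplicated expression: any change to the formula has to be made in both places.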


The reason you cannot use your alias in the WHERE clause is the order in which MySQL processes a query:

  • FROM
  • WHERE
  • GROUP BY
  • HAVING
  • SELECT
  • ORDER BY

When the WHERE clause is evaluated, the value of your column alias has not been calculated yet. This is a good thing, because calculating it first would cost a lot of performance. Imagine many (1,000,000) rows: to use your calculation in the WHERE clause, each of those 1,000,000 rows must first be fetched and its distance computed, so that WHERE can compare the result with your threshold.

You can still do this explicitly by:

  • using HAVING, where MySQL does accept select aliases (this is why HAVING is a separate clause and not the same thing as WHERE)
  • using a subquery, as shown in @MostyMostacho's answer (it does the same thing, with some overhead)
  • putting the complex calculation into the WHERE clause (this gives effectively the same performance as HAVING)

All of them work in almost the same way: every row is fetched first, its distance is calculated, and only then are the rows filtered by distance before the result is sent to the client.

You can get much (!) better performance by combining a simple WHERE clause that approximates the distance (a first, coarse filter on the rows) with the more accurate haversine formula in the HAVING clause.

  • Finding rows that might match the @dist = 10 condition with a WHERE clause based on the plain difference in X and Y (a bounding box) is a cheap operation.
  • Filtering those candidates with the precise distance formula in the HAVING clause is an expensive operation, but it now runs on far fewer rows.

Take a look at this query to understand what I mean:

    SET @orig_lat=55.4058;
    SET @orig_lon=13.7907;
    SET @dist=10;
    SELECT *,
        3956 * 2 * ASIN(SQRT(
            POWER(SIN((@orig_lat - abs(latitude)) * pi()/180 / 2), 2)
            + COS(@orig_lat * pi()/180) * COS(abs(latitude) * pi()/180)
            * POWER(SIN((@orig_lon - longitude) * pi()/180 / 2), 2)
        )) AS distance
    FROM geo_kulplex.sweden_bobo
    /* WHERE clause to pre-filter by a distance approximation;
       results are filtered later with the precise calculation.
       This part can use indexes. */
    WHERE
        /* I'm unsure about the geo details: @dist is in miles, and you
           almost certainly do not want a box of 10° here. Convert the
           distance to degrees and adjust this properly!! */
        latitude BETWEEN (@orig_lat - @dist) AND (@orig_lat + @dist)
        AND longitude BETWEEN (@orig_lon - @dist) AND (@orig_lon + @dist)
    /* HAVING clause to filter the result using the more precise distance formula */
    HAVING distance < @dist
    ORDER BY distance
    LIMIT 10;
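The query above uses @dist (miles) directly as a number of degrees, which its own comment flags as wrong. One way the bounding box could be sized is sketched below. The variables @lat_delta and @lon_delta are hypothetical names introduced here, the conversion assumes the same Earth radius of 3956 miles used in the distance formula, and it assumes the search area does not touch a pole or cross the 180° meridian.

```sql
SET @orig_lat = 55.4058;
SET @orig_lon = 13.7907;
SET @dist = 10;  -- miles

-- Angular radius of the search circle, converted to degrees of latitude.
SET @lat_delta = DEGREES(@dist / 3956);
-- Degrees of longitude cover less ground away from the equator,
-- so the longitude half-width is widened by the cosine of the latitude.
SET @lon_delta = DEGREES(ASIN(SIN(@dist / 3956) / COS(RADIANS(@orig_lat))));

SELECT *,
       3956 * 2 * ASIN(SQRT(
           POWER(SIN((@orig_lat - abs(latitude)) * pi()/180 / 2), 2)
           + COS(@orig_lat * pi()/180) * COS(abs(latitude) * pi()/180)
           * POWER(SIN((@orig_lon - longitude) * pi()/180 / 2), 2)
       )) AS distance
FROM geo_kulplex.sweden_bobo
-- Cheap pre-filter: plain range comparisons on the stored columns.
WHERE latitude  BETWEEN @orig_lat - @lat_delta AND @orig_lat + @lat_delta
  AND longitude BETWEEN @orig_lon - @lon_delta AND @orig_lon + @lon_delta
-- Expensive exact filter, applied only to rows inside the box.
HAVING distance < @dist
ORDER BY distance
LIMIT 10;
```

For the pre-filter to pay off, the table would need an index that covers the range columns, for example on (latitude, longitude); whether such an index exists on sweden_bobo is not stated in the question.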

For those who are interested in the constant:

  • 3956 is the (approximate) radius of the Earth in miles, so the resulting distance is measured in miles
  • 6371 is the (approximate) radius of the Earth in kilometers, so use this constant to measure distance in kilometers.
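As a sketch of the kilometer variant, only the constant and the unit of the threshold change. The variable @dist_km and the alias distance_km are hypothetical names used here for clarity; the value 16 is just an illustrative radius, roughly the 10 miles used earlier.

```sql
SET @orig_lat = 55.4058;
SET @orig_lon = 13.7907;
SET @dist_km = 16;  -- hypothetical search radius in kilometers

SELECT *,
       -- 6371 (km) replaces 3956 (miles); the rest of the formula is unchanged.
       6371 * 2 * ASIN(SQRT(
           POWER(SIN((@orig_lat - abs(latitude)) * pi()/180 / 2), 2)
           + COS(@orig_lat * pi()/180) * COS(abs(latitude) * pi()/180)
           * POWER(SIN((@orig_lon - longitude) * pi()/180 / 2), 2)
       )) AS distance_km
FROM geo_kulplex.sweden_bobo
HAVING distance_km < @dist_km
ORDER BY distance_km
LIMIT 10;
```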

For more information, see the Wikipedia article on the haversine formula.


Source: https://habr.com/ru/post/922032/