I spent some time in the source code, and I think I understand what is going wrong. First, I made the erroneous assumption that the distances calculated by the .us geocoder would be the same as the distances Lucene calculates internally between points. The values are close but not exact. So I switched to calculating the distances between lat/lon pairs by calling Lucene:
double distance = DistanceUtils.getInstance().getDistanceMi(lat1,lon1,lat2,lon2);
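For comparing against values from another source, here is a minimal stand-alone sketch of a great-circle (haversine) distance. It is not Lucene's implementation: the class name, the method name, and the Earth radius of 3958.8 miles are my assumptions. Different tools use slightly different radii and formulas, which could account for results that are close but not identical.

```java
public class GreatCircle {
    // Assumed mean Earth radius in statute miles; other libraries may use a different value.
    public static final double EARTH_RADIUS_MI = 3958.8;

    // Haversine great-circle distance in miles between two lat/lon points given in degrees.
    public static double haversineMiles(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);
        double sinPhi = Math.sin(dPhi / 2.0);
        double sinLambda = Math.sin(dLambda / 2.0);
        double a = sinPhi * sinPhi + Math.cos(phi1) * Math.cos(phi2) * sinLambda * sinLambda;
        return 2.0 * EARTH_RADIUS_MI * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        // One degree of longitude at latitude 34, starting from the example point used later.
        System.out.println(haversineMiles(34.0, -118.0, 34.0, -117.0));
    }
}
```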
Then I dug into the DistanceQueryBuilder class http://grepcode.com/file/repo1.maven.org/maven2/org.apache.lucene/lucene-spatial/2.9.4/org/apache/lucene/spatial/tier/DistanceQueryBuilder.java?av=f , which I think has an error.
It computes the bounding box to select the Cartesian tiers as follows:
CartesianPolyFilterBuilder cpf = new CartesianPolyFilterBuilder(tierFieldPrefix);
Filter cartesianFilter = cpf.getBoundingArea(lat, lng, miles);
It is pretty clear from looking at LLRect.createBox http://grepcode.com/file/repo1.maven.org/maven2/org.apache.lucene/lucene-spatial/2.9.4/org/apache/lucene/spatial/geometry/shape/LLRect.java#LLRect.createBox%28org.apache.lucene.spatial.geometry.LatLng%2Cdouble%2Cdouble%29 that the third parameter of getBoundingArea is treated as the full width/height of the box. Passing the radius therefore produces a bounding box that is too small.
The fix was to provide an alternative version of DistanceQueryBuilder that does this:
Filter cartesianFilter = cpf.getBoundingArea(lat,lng,miles*2);
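To illustrate why doubling matters, here is a small stand-alone sketch. It does not use Lucene: latRange is a hypothetical stand-in for a box builder that treats its size argument as the full height, and the figure of 69.170976 miles per degree of latitude is an assumed constant. With a 25 mile radius, a point 20 miles north of the center should be a candidate, but it only falls inside the box when the radius is doubled.

```java
public class BoxWidthCheck {
    // Assumed conversion factor: statute miles per degree of latitude.
    public static final double MILES_PER_LAT_DEG = 69.170976;

    // Hypothetical box builder: fullHeightMi is the FULL height of the box,
    // so only half of it extends to each side of the center.
    public static double[] latRange(double centerLat, double fullHeightMi) {
        double halfDeg = (fullHeightMi / 2.0) / MILES_PER_LAT_DEG;
        return new double[] { centerLat - halfDeg, centerLat + halfDeg };
    }

    // True if lat lies within the [min, max] range.
    public static boolean covers(double[] range, double lat) {
        return lat >= range[0] && lat <= range[1];
    }

    public static void main(String[] args) {
        double lat = 34.0;
        double radius = 25.0;
        // A point 20 miles due north of the center, well inside the search radius.
        double pointLat = lat + 20.0 / MILES_PER_LAT_DEG;

        // Passing the radius as the full height reaches only 12.5 miles north.
        System.out.println(covers(latRange(lat, radius), pointLat));       // prints false
        // Passing twice the radius reaches the full 25 miles north.
        System.out.println(covers(latRange(lat, radius * 2.0), pointLat)); // prints true
    }
}
```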
This seems to work. I am still not convinced that DistanceApproximation http://grepcode.com/file/repo1.maven.org/maven2/org.apache.lucene/lucene-spatial/2.9.4/org/apache/lucene/spatial/geometry/shape/DistanceApproximation.java#DistanceApproximation.getMilesPerLngDeg%28double%29 works correctly, because converting a radius to a longitude or latitude delta and then measuring the distance that delta covers should give back the radius, but it does not.
For example, with lat = 34, lng = -118 and radius = 25 (printing the results instead of asserting on them), I get:
Lng delta: 0.36142327178505024, dist: 20.725929003138496
Lat delta: 0.4359569489852007, dist: 30.155567734407825
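The printed values follow a pattern that a little arithmetic makes visible. The sketch below is not Lucene code: the constant of 69.170976 miles per degree of latitude, the flat small-offset distance estimate, and all names are my assumptions. It shows that 25 miles divided by the per-latitude-degree factor gives about 0.3614, the value printed as the longitude delta, and that 0.3614 degrees of longitude at latitude 34 spans only about 20.73 miles. The output would therefore be consistent with the latitude and longitude factors being exchanged somewhere, though I have not confirmed where that would happen.

```java
public class RoundTripCheck {
    // Assumed conversion factor: statute miles per degree of latitude.
    public static final double MILES_PER_LAT_DEG = 69.170976;

    // Miles per degree of longitude shrinks with the cosine of the latitude.
    public static double milesPerLngDeg(double latDeg) {
        return Math.cos(Math.toRadians(latDeg)) * MILES_PER_LAT_DEG;
    }

    // Flat approximation of the distance covered by a small offset in degrees.
    public static double approxMiles(double latDeg, double dLatDeg, double dLngDeg) {
        double dy = dLatDeg * MILES_PER_LAT_DEG;
        double dx = dLngDeg * milesPerLngDeg(latDeg);
        return Math.sqrt(dx * dx + dy * dy);
    }

    public static void main(String[] args) {
        double lat = 34.0;
        double radius = 25.0;

        double latDelta = radius / MILES_PER_LAT_DEG;   // about 0.3614 degrees
        double lngDelta = radius / milesPerLngDeg(lat); // about 0.4360 degrees

        // Correct pairing: each delta measures back to the radius.
        System.out.println(approxMiles(lat, latDelta, 0.0)); // close to 25
        System.out.println(approxMiles(lat, 0.0, lngDelta)); // close to 25

        // Exchanged pairing: the latitude-sized delta applied along longitude, and vice versa.
        System.out.println(approxMiles(lat, 0.0, latDelta)); // roughly 20.73
        System.out.println(approxMiles(lat, lngDelta, 0.0)); // roughly 30.16
    }
}
```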
I assume the code only works because the Cartesian tiers chosen for the bounding box end up covering an area slightly larger than the box itself. But I do not think that is guaranteed.
I hope someone with more knowledge of this can comment, because these are just my observations after digging through the code for a day. I noticed that what looks like the latest code for Lucene spatial is on Google Code at http://code.google.com/p/spatial-search-lucene/ , and it seems the implementation has changed significantly, but I did not go very deep into the details.