I am building a searchable catalog of sports tournaments on GAE, with web2py and a Flex front end. The user selects a location, radius, and maximum date from a set of options. I have a basic version of this request working, but it is inefficient and slow. One improvement I know I can make is to condense the many individual queries I currently use to assemble the objects into batch queries; I only just found out that this is possible. But I am also considering a more extensive redesign that uses memcache.
The main problem is that I cannot query the datastore by location, because GAE does not allow inequality comparison operators (<, <=, >=, >) on more than one property in a single query. I already use one for the date, and I would need TWO more to check both latitude and longitude, so that is not an option. At the moment, my algorithm is as follows:
1.) Query the datastore by date and fetch the results
2.) Use the destination function from the geographic distance module to find the maximum and minimum latitude and longitude for the given radius
3.) Loop through the results and discard everything with lat/lng outside the max/min bounding box
4.) Loop again and use the distance function to check the exact distance, since the box from step 2 includes some area outside the radius; discard anything whose distance falls outside it. (Is this 2/3/4 combination inefficient?)
5.) Gather the many-to-many lists and attach them to the objects (this is where I need to switch to batch operations)
6.) Return to the client
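Steps 2-4 above can be sketched as follows. This is a minimal, self-contained sketch: the event dicts and helper names are hypothetical, and the bounding-box math uses a simple spherical approximation rather than whatever your geo module provides. Note that the box check and the exact check can run in a single pass, with the cheap box comparison short-circuiting the trigonometry for most far-away points, so 2/3/4 need not mean two full loops.

```python
import math

EARTH_RADIUS_KM = 6371.0

def bounding_box(lat, lng, radius_km):
    """Step 2: max/min lat/lng bounds for the given radius (spherical approximation)."""
    dlat = math.degrees(radius_km / EARTH_RADIUS_KM)
    dlng = math.degrees(radius_km / (EARTH_RADIUS_KM * math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lng - dlng, lng + dlng

def haversine_km(lat1, lng1, lat2, lng2):
    """Step 4: exact great-circle distance between two points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def filter_by_radius(events, lat, lng, radius_km):
    """Steps 3-4 in one pass: cheap box prefilter, then exact distance check."""
    min_lat, max_lat, min_lng, max_lng = bounding_box(lat, lng, radius_km)
    kept = []
    for e in events:
        if not (min_lat <= e['lat'] <= max_lat and min_lng <= e['lng'] <= max_lng):
            continue  # outside the bounding box, skip the trig entirely
        if haversine_km(lat, lng, e['lat'], e['lng']) <= radius_km:
            kept.append(e)
    return kept
```

The box is deliberately a superset of the circle, so the exact check only has to reject the "corners" the box lets through.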
Here is my plan for using memcache... let me know if I am out in left field on this, since I have no prior experience with memcache or caching servers in general.
- Keep a list in the cache of "geo objects" that represent all my data. They have five properties: latitude, longitude, event_id, event_type (pending expansion beyond tournaments), and start date. This list will be sorted by date.
- Also keep a pointer list in the cache: the start and end indexes into the cached list for every date range my application uses (next week, 2 weeks, month, 3 months, 6 months, year, 2 years).
- A scheduled task that updates the pointers daily at 12:00.
- Add new inserts to the cache as well as the datastore, and update the pointers.
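The cached list plus date-range pointers could be built like this. A hedged sketch under stated assumptions: the geo-object dicts, the range labels, and the day counts are illustrative, and the real version would put `geo_list` and `pointers` into memcache rather than returning them. Because the list is sorted by date, `bisect` gives each pointer pair in O(log n).

```python
import bisect
from datetime import date, timedelta
from operator import itemgetter

# Date ranges offered by the app, in days (labels/values are assumptions).
RANGES = {'week': 7, '2weeks': 14, 'month': 30, '3months': 91,
          '6months': 182, 'year': 365, '2years': 730}

def build_cache(geo_objects, today):
    """Sort the cached geo-object list by date and compute the range pointers.
    This is what the daily scheduled task (and each insert) would refresh."""
    geo_list = sorted(geo_objects, key=itemgetter('start_date'))
    dates = [g['start_date'] for g in geo_list]
    start = bisect.bisect_left(dates, today)  # skip events already in the past
    pointers = {}
    for label, days in RANGES.items():
        end = bisect.bisect_right(dates, today + timedelta(days=days))
        pointers[label] = (start, end)
    # In the real app: memcache.set('geo_list', geo_list); memcache.set('pointers', pointers)
    return geo_list, pointers

def slice_for_range(geo_list, pointers, label):
    """New step 1: slice the cached list for the requested date range."""
    start, end = pointers[label]
    return geo_list[start:end]
```

One design note: recomputing the pointers is cheap next to rebuilding the list itself, so refreshing them on every insert (as in the last bullet) should not hurt.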
Using this design, the algorithm will now look like this:
1.) Use the pointers to slice out the relevant portion of the list based on the submitted date
2-4.) Same as above, except operating on the geo objects
5.) Use a batch get to fetch the full tournaments for the remaining geo objects' event_ids
6.) Gather the many-to-manys
7.) Return to the client
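For the new step 5, the point is one bulk lookup instead of a get per event_id. A hedged sketch: on GAE, `db.get()` accepts a list of keys and fetches them in a single RPC, but the `Tournament` model name is an assumption, so the testable part below stubs the datastore out as a dict.

```python
# On GAE the batch fetch would look roughly like:
#   from google.appengine.ext import db
#   keys = [db.Key.from_path('Tournament', g['event_id']) for g in geo_objects]
#   tournaments = db.get(keys)   # one RPC for all keys; missing keys come back as None
#
# The generic pattern, with the datastore replaced by a dict for illustration:

def batch_fetch(store, geo_objects):
    """One bulk lookup for all surviving event_ids, preserving their order."""
    ids = [g['event_id'] for g in geo_objects]
    entities = [store.get(i) for i in ids]  # stand-in for a single db.get(keys)
    return [e for e in entities if e is not None]
```

The same trick applies to step 6: collect all the many-to-many keys first, then fetch them in one batch rather than per tournament.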
Thoughts on this approach? Thanks so much for reading and any advice you can give.
-Dane