I am stumped.
Today I noticed that some of the data I expected to be present in my appengine application was not showing up. I connected to the application through the remote console and ran the queries manually. Sure enough, it looked as if I had only 15 of the 101 rows that I expected to see.
Then I went to the admin console on appengine.google.com and ran the Datastore Viewer with the following query:
SELECT * FROM Assignment where game = KEY('Game', '201212-foo') and player = KEY('Player', 'player-mb')
The result I see is the first page of 20 results. Paging through these results, I can see all 101 entities. HOORAY! My data is still there. BUT why then can't I access it through the db API? (NOTE: I already tried flushing memcache through the memcache viewer, although this particular query was not manually memcached.)
From a remote console:
> from google.appengine.ext.db import GqlQuery
> GqlQuery("SELECT * FROM Assignment WHERE game = KEY('Game', '201212-foo') and player = KEY('Player', 'player-mb')").count()
15
The remote console is consistent with the application itself, which apparently can see only 15 of the expected 101 rows.
What gives?
UPDATE
I suspect this may be an indexing issue. If I issue get_by_key_name for one of the missing rows, it subsequently appears in db API queries.
> GqlQuery("SELECT * FROM Assignment WHERE game = KEY('Game', '201212-foo') and player = KEY('Player', 'player-mb')").count()
15
> entities.Assignment.get_by_key_name('201212-assignment-135.9')
<entities.Assignment object at 0xa11eb6c>
> GqlQuery("SELECT * FROM Assignment WHERE game = KEY('Game', '201212-foo') and player = KEY('Player', 'player-mb')").count()
16
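One way to pin down exactly which rows a query is dropping is to diff the key names that should exist against the key names the query actually returns. The following is only a minimal sketch in plain Python: `missing_key_names` and the sample key names are hypothetical, and how the two lists get collected (for example, from the Datastore Viewer and from the remote console) is left out.

```python
def missing_key_names(expected, returned):
    """Return the expected key names that the query did not return, sorted."""
    returned_set = set(returned)
    return sorted(name for name in expected if name not in returned_set)


# Hypothetical sample data standing in for real key names.
expected = ["assignment-1", "assignment-2", "assignment-3", "assignment-4"]
returned = ["assignment-2", "assignment-4"]

missing = missing_key_names(expected, returned)
print(missing)
```

Each name in the resulting list could then be passed to get_by_key_name to check whether fetching it by key makes it reappear in query results, as happened above.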
So should I (or can I) rebuild my indexes to fix this problem?
UPDATE #2:
I tried building a perfect index for this query and just confirmed that even when the query uses the newly built index (checked via query.index_list()), the results are still limited to a small subset of what is visible through the Datastore Viewer. Worryingly, this is actually a different subset than the one available with the previous index (20 items versus 15 items). So adding an additional filter term now returns 5 more rows. So weird.
All indexes claim to be "serving", so there should be no reason for the indexes to be lagging behind.
UPDATE #3:
Sometimes, using my new index, I get the correct answer:
> GqlQuery("SELECT * FROM Assignment WHERE game = KEY('Game', '201212-foo') and player = KEY('Player', 'player-mb') and user = 'zee'").count()
101
However, if I issue this query 10 times, it returns "bad" results in about half the cases:
> GqlQuery("SELECT * FROM Assignment WHERE game = KEY('Game', '201212-foo') and player = KEY('Player', 'player-mb') and user = 'zee'").count()
16
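To put a number on how often the "bad" result shows up, the same query can be run repeatedly and the distinct counts tallied. This is a rough sketch, not something from the appengine SDK: `tally_counts` is a hypothetical helper, and the `flaky_count` stub only simulates alternating answers so the snippet runs on its own. In the remote console, the callable would instead wrap the GqlQuery(...).count() call shown above.

```python
from collections import Counter
from itertools import cycle


def tally_counts(run_query, attempts=10):
    """Call run_query() `attempts` times and tally each distinct result."""
    return Counter(run_query() for _ in range(attempts))


# Stand-in for the real query: alternates between the two counts seen above.
_simulated = cycle([101, 16])


def flaky_count():
    return next(_simulated)


tally = tally_counts(flaky_count, attempts=10)
for count, times in sorted(tally.items()):
    print("count=%d seen %d times" % (count, times))
```

More than one distinct key in the tally means that identical queries are returning different answers.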
So maybe this is a bad or lagging replica that I hit half the time, or something else completely opaque that we won't get an answer to (the appengine status page shows a service disruption today), but I have a feeling it will fix itself on its own. I will update if that happens.
FINAL UPDATE:
As I suspected, when I woke up this morning, my application (and my manual queries) now sees a consistent, correct view of the data. I would still love an answer as to why this happened, but until I get one, I'm going to chalk it up to some tremendous internal oddity at Google.
I filed an issue against appengine to see if I can get a response from someone in the know.