I would suggest a hybrid approach here. As requests come in, perform two checks: the first against a local in-memory cache, the second against the MongoDB store. If the first misses but the second hits, add the word to the in-memory cache. Over time the application will be "primed" with the most common "bad passwords"/entries.
This has two advantages:
1) Common words are rejected quickly, straight from memory.
2) The startup cost is close to zero and is amortized across many requests.
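Here is a minimal sketch of that hybrid check, assuming a Flask app with Flask-PyMongo; the connection URI, function name, and cache variable are illustrative, not taken from your code:

from flask import Flask
from flask_pymongo import PyMongo

app = Flask(__name__)
app.config["MONGO_URI"] = "mongodb://localhost:27017/mydb"  # assumed URI
mongo = PyMongo(app)

_bad_password_cache = set()  # warms up with the most common bad passwords

def is_bad_password(word):
    # First check: the local in-memory cache.
    if word in _bad_password_cache:
        return True
    # Second check: the MongoDB store (each word held in _id, see below).
    if mongo.db.passwords.find_one({"_id": word}) is not None:
        _bad_password_cache.add(word)  # cache miss, DB hit: remember it
        return True
    return False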
When storing the word list in MongoDB, I would make the _id field hold each word. By default you get an ObjectId, which is a complete waste in this case, and this way we also get to use the automatic index on _id for free. I suspect the poor performance you saw was due to there being no index on the "pass" field. You could also try adding one on "pass":
mongo.db.passwords.create_index("pass")
To complete the _id scheme, inserting a word looks like this:
mongo.db.passwords.insert_one({"_id": "password"})
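To load a whole word list at once, something like this sketch would work (the file name is illustrative; duplicates are skipped on re-runs):

from pymongo.errors import BulkWriteError

with open("bad_passwords.txt") as f:
    docs = [{"_id": line.strip()} for line in f if line.strip()]
try:
    # ordered=False keeps going past duplicate-key errors
    mongo.db.passwords.insert_many(docs, ordered=False)
except BulkWriteError:
    pass  # duplicate _id values already in the collection are fine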
The query then looks like this:
mongo.db.passwords.find({"_id": request.form["password"]})
As @Madarco mentioned, you can also shave a bit more off the query by ensuring the results are served straight from the index (a "covered" query), restricting the returned fields to just the _id field ({"_id": 1}):
mongo.db.passwords.find( { "_id" : request.form["password"] }, { "_id" : 1} )
HTH - Rob
P.S. I am not a Python/PyMongo expert, so the syntax above may not be 100% correct. Hope it is still helpful.