Python pickle vs SQL efficiency

I am developing a Python application that requires storing (very) large data sets. Is pickling the most practical way to store the data and retrieve it on demand, or should I use SQL instead? My main goals are speed and as little processing as possible.

My concern is that pickle has to process the entire large file when it loads, which could hurt performance. I am not particularly familiar with pickling beyond basic use, so any explanation of how it works would be great.

I am using this code now:

import pickle

def check_login():
    # Loads the entire pickled dict into memory before any lookup
    users = pickle.load(open("users.py", "rb"))
    username = raw_input("Please enter a username: ")
    password = raw_input("Please enter a password: ")
    if username not in users:
        return 0
    if users[username] != password:
        return 0
    return 1

Assuming users contains 1 million records, which would be more efficient, pickle or SQL?

Any help would be great

thanks

2 answers

Pickle is generally suited to storing Python objects; if you just want to store raw data efficiently and query it, then pickling is probably not the way to go. But it really depends on the specific situation: how time-critical is loading the data, and do you have the development time to set up a database, write queries, and so on?

If your data is a million pairs of username and date of birth, then pickle is probably not the best way to go; it would be simpler to store the data in a flat text file.
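A minimal sketch of the flat-file approach mentioned above (the filename and the `:` separator are illustrative assumptions, not anything from the original post):

```python
def save_users(users, path="users.txt"):
    # Write one "username:password" pair per line.
    with open(path, "w") as f:
        for name, pw in users.items():
            f.write("%s:%s\n" % (name, pw))

def check_user(username, password, path="users.txt"):
    # Scan the file line by line; no need to hold every record in memory.
    with open(path) as f:
        for line in f:
            name, pw = line.rstrip("\n").split(":", 1)
            if name == username:
                return pw == password
    return False
```

Note this is still a linear scan, so for a million records it trades memory for lookup time; it only wins over pickle in that it avoids deserializing everything up front.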

Both pickle and DB/SQL solutions have the advantage of being extensible. Remember that pickle is not "secure" (unpickling untrusted data can execute arbitrary code), so you should consider how trustworthy the file is, e.g. whether it will be transferred between different systems.

In general, if your data sets are very large, a relational DB may be more suitable than pickle, but you could also consider other storage mechanisms, e.g. Redis, MongoDB, or memcached. The right choice is highly situation-dependent, so it would help to know more about how the data will be used.


Since you are looking up a particular user in the users object, I think SQL would be a better solution.

Assuming users is an array, you would have to search for that user from the beginning to the end of the array. With SQL, you have the option of adding indexes, which, depending on how you model your user object, can give you a significant speed-up.
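A sketch of the indexed-lookup idea using SQLite from the standard library (SQLite, the table name, and the schema are assumptions; any relational DB works the same way):

```python
import sqlite3

# In-memory DB for illustration; a real app would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY, password TEXT)")
# PRIMARY KEY creates an index automatically; an explicit one would be:
#   CREATE INDEX idx_username ON users (username)
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "s3cret"))
conn.commit()

def check_credentials(conn, username, password):
    # The index lets the DB jump straight to the row instead of
    # scanning all users.
    row = conn.execute(
        "SELECT password FROM users WHERE username = ?", (username,)
    ).fetchone()
    return row is not None and row[0] == password
```

With a million rows, this lookup touches only the index and one row, rather than loading or scanning the whole data set.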

Pickle also has to parse the file and recreate every stored object in memory before you can do anything with it, so there is the cost of loading alone, which (both in CPU time and in memory) is likely to make it worse.
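To make the loading cost concrete, here is a small sketch (the filename and record count are arbitrary): even a single membership test first pays for deserializing every entry in the file.

```python
import pickle

# Build and pickle a dict of 100,000 fake credential pairs.
users = {"user%d" % i: "pw%d" % i for i in range(100000)}
with open("users.pkl", "wb") as f:
    pickle.dump(users, f)

# A single lookup still requires rebuilding the whole dict in memory.
with open("users.pkl", "rb") as f:
    loaded = pickle.load(f)  # deserializes all 100,000 entries
print("user42" in loaded)  # -> True, but only after a full load
```

A database, by contrast, reads only the pages needed to answer the query.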


Source: https://habr.com/ru/post/1480234/
