Compromise in Python Dictionary Keys

Let's say I'm going to build a probably large Python 3 dictionary for memory operations. The dictionary keys are integers, but I will first read them from the file as strings.

Regarding storage and search, I wonder if it is important to store dictionary keys as integers or as strings.
In other words, leaving them whole, help with hashing?

+5
source share
3 answers

Dictations are quick, but can be hard to remember. Usually this should not be a problem, but you will only know when testing. I would suggest testing 1,000 lines, 10,000 lines, etc. first. And look at the amount of memory.

If you run out of memory and your data structure allows, perhaps try using named tuples .

EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade') import csv for emp in map(EmployeeRecord._make, csv.reader(open("employees.csv", "rb"))): print(emp.name, emp.title) 

(Example taken from the link)

If you have ascending integers, you can also try to get more fancy using the array module .

+3
source

Actually hashing strings is pretty effective in Python 3. I expected this to have the opposite result:

 >>> timeit('d["1"];d["4"]', setup='d = {"1": 1, "4": 4}') 0.05167865302064456 >>> timeit('d[1];d[4]', setup='d = {1: 1, 4: 4}') 0.06110116100171581 
+1
source

You don't seem tired of comparing alternatives. It turns out that the difference is quite small, and I also find conflicting differences. In addition, this is detailed implementation information, since both integers and strings are immutable, they could be compared as pointers.

What you have to consider is one that is a natural choice of key. For example, if you do not interpret the key as a number elsewhere, there is little reason to convert it to an integer.

In addition, you should consider whether you want keys to be considered equal if their numeric value is the same or if they should be lexically identical. For example, if you think 00 the same key as 0 , you will need to interpret it as an integer, and then the integer is the correct key, if, on the other hand, you want to consider them different, then it would be completely wrong to convert them into integers (as they become the same).

+1
source

Source: https://habr.com/ru/post/1238941/


All Articles