Why dict.get(key) is slower than dict[key]

Question


While running a numerical integrator, I noticed a clear difference in speed depending on how I extract the value of a field from a dictionary:

    import numpy as np

    def bad_get(mydict):
        '''Extract the name field using get()'''
        output = mydict.get('name', None)
        return output

    def good_get(mydict):
        '''Extract the name field using if-else'''
        if 'name' in mydict:
            output = mydict['name']
        else:
            output = None
        return output

    name_dict = dict()
    name_dict['name'] = np.zeros((5000,5000))

On my system, I notice the following difference (using IPython):

    %%timeit
    bad_get(name_dict)

    The slowest run took 7.75 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 247 ns per loop

Compared with

    %%timeit
    good_get(name_dict)

    1000000 loops, best of 3: 188 ns per loop

This may seem like a small difference, but for some arrays the difference appears to be even more dramatic. What causes this behavior, and should I change how I use the get() function?


performance python dictionary

wil3 Apr 12 '16 at 7:24


1 answer

Martijn Pieters · Accepted Answer · 2016-04-12T07:29:13+0000

Python has to do more work for dict.get():

get is an attribute, so Python has to look it up and then bind the descriptor it finds to the dictionary instance.
() is a call, so the current frame has to be pushed onto the stack, the call has to be made, and then the frame has to be popped from the stack again to continue.

The [...] notation, used with a dict, does not require a separate attribute lookup step or a frame push and pop.
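The attribute lookup and binding step described above can be made visible with a minimal sketch. The names `d`, `descriptor`, `bound_get` and `manual` are hypothetical and chosen only for illustration; the sketch assumes CPython, where dict.get is stored on the type as a descriptor.

```python
# Illustration only: all names here are hypothetical.
d = {'name': 1}

# dict.get lives on the type as a descriptor ...
descriptor = type(d).__dict__['get']

# ... and each d.get access binds it to the instance, producing a callable
bound_get = d.get
is_bound_to_d = bound_get.__self__ is d

# Doing by hand what the attribute lookup does implicitly
manual = descriptor.__get__(d, type(d))
same_result = manual('name') == bound_get('name') == d['name']

print(is_bound_to_d, same_result)
```

None of this happens for d['name'], which is why subscription skips that part of the cost.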

You can see the difference when you use dis, the Python bytecode disassembler:

    >>> import dis
    >>> dis.dis(compile('d[key]', '', 'eval'))
      1           0 LOAD_NAME                0 (d)
                  3 LOAD_NAME                1 (key)
                  6 BINARY_SUBSCR
                  7 RETURN_VALUE
    >>> dis.dis(compile('d.get(key)', '', 'eval'))
      1           0 LOAD_NAME                0 (d)
                  3 LOAD_ATTR                1 (get)
                  6 LOAD_NAME                2 (key)
                  9 CALL_FUNCTION            1
                 12 RETURN_VALUE

So the d[key] expression only has to execute a BINARY_SUBSCR opcode, while d.get(key) adds a LOAD_ATTR opcode and a CALL_FUNCTION opcode. CALL_FUNCTION is a lot more expensive than BINARY_SUBSCR on a built-in type (custom types with __getitem__ methods still end up making a function call).
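To get a rough feel for how much each step costs, a sketch like the following times the three variants with the standard timeit module. The names `d`, `get` and `n` are hypothetical, the repeat count is arbitrary, and the absolute numbers depend on the machine and Python version, so no particular result is implied.

```python
import timeit

d = {'name': 0}
get = d.get  # bind once, so the timed statement skips the attribute lookup

n = 200_000  # arbitrary repeat count, chosen only to keep the run short
t_subscript = timeit.timeit("d['name']", globals={'d': d}, number=n)
t_get = timeit.timeit("d.get('name')", globals={'d': d}, number=n)
t_bound = timeit.timeit("get('name')", globals={'get': get}, number=n)

print(f"d['name']      {t_subscript:.4f} s")
print(f"d.get('name')  {t_get:.4f} s")
print(f"get('name')    {t_bound:.4f} s")
```

Comparing the last two lines gives an idea of the attribute lookup cost alone; comparing the first and last gives an idea of the call overhead.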

If most of your keys exist in the dictionary, you can use try...except KeyError to handle missing keys:

    try:
        return mydict['name']
    except KeyError:
        return None

Exception handling is cheap if there are no exceptions.
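Wrapped in a function, the pattern might look like the sketch below. The function name `get_name` is hypothetical; it is meant as a drop-in alternative to the two helpers in the question, on the assumption that the key is present most of the time.

```python
def get_name(mydict):
    """Return mydict['name'], or None when the key is missing."""
    try:
        # Fast path: a plain subscription, no attribute lookup or method call
        return mydict['name']
    except KeyError:
        # Slow path: only taken when the key is absent
        return None

present = get_name({'name': 'field'})
missing = get_name({})
print(present, missing)
```

If missing keys are common, the cost of raising and catching KeyError may outweigh the saving, so it is worth timing both variants on realistic data.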
