Python Performance: Try-except or not in?

In one of my classes, I have a number of methods that all read values from the same dictionaries. However, if one of the methods tries to access a value that is not there, it has to call another method to create the value associated with that key.

This is currently implemented as follows, where findCrackDepth(tonnage) assigns a value to self.lowCrackDepth[tonnage].

if tonnage not in self.lowCrackDepth:
    self.findCrackDepth(tonnage)
lcrack = self.lowCrackDepth[tonnage]

However, I could also write this as

try:
    lcrack = self.lowCrackDepth[tonnage]
except KeyError:
    self.findCrackDepth(tonnage)
    lcrack = self.lowCrackDepth[tonnage]

I assume that the difference in performance between the two depends on how often the values are already in the dictionary. How big is this difference? I will be generating several million of these values (spread across many dictionaries in many instances of the class), and for each time a value does not exist, there are probably two times when it does.

+3
5 answers

This is a delicate problem to time, because you need to avoid lasting side effects between runs, and the performance tradeoff depends on the percentage of missing keys. So, consider a file dil.py as follows:

def make(percentmissing):
  global d
  d = dict.fromkeys(range(100-percentmissing), 1)

def addit(d, k):
  d[k] = k

def with_in():
  dc = d.copy()
  for k in range(100):
    if k not in dc:
      addit(dc, k)
    lc = dc[k]

def with_ex():
  dc = d.copy()
  for k in range(100):
    try:
      lc = dc[k]
    except KeyError:
      addit(dc, k)
      lc = dc[k]

def with_ge():
  dc = d.copy()
  for k in range(100):
    lc = dc.get(k)
    if lc is None:
      addit(dc, k)
      lc = dc[k]

and a series of timeit calls, such as:

$ python -mtimeit -s'import dil; dil.make(10)' 'dil.with_in()'
10000 loops, best of 3: 28 usec per loop
$ python -mtimeit -s'import dil; dil.make(10)' 'dil.with_ex()'
10000 loops, best of 3: 41.7 usec per loop
$ python -mtimeit -s'import dil; dil.make(10)' 'dil.with_ge()'
10000 loops, best of 3: 46.6 usec per loop

So, with 10% missing keys, the in check is clearly the fastest approach.

$ python -mtimeit -s'import dil; dil.make(1)' 'dil.with_in()'
10000 loops, best of 3: 24.6 usec per loop
$ python -mtimeit -s'import dil; dil.make(1)' 'dil.with_ex()'
10000 loops, best of 3: 23.4 usec per loop
$ python -mtimeit -s'import dil; dil.make(1)' 'dil.with_ge()'
10000 loops, best of 3: 42.7 usec per loop

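With just 1% missing keys, the exception approach is marginally the fastest (and the get approach remains the slowest of the three in either case).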

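So, for optimal performance, unless the vast majority (99%+) of lookups are going to succeed, the in approach is preferable.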
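Since the crossover depends on the miss rate, it may be worth measuring with your own percentages. The following is a minimal sketch of the same comparison driven from Python through the timeit module rather than the command line. The helper best_time and the chosen percentages are assumptions for illustration, not part of dil.py, and the absolute numbers will vary by machine and Python version.

```python
import timeit

def make(percent_missing):
    # Keys 0 .. (100 - percent_missing - 1) are present, mirroring make() in dil.py.
    return dict.fromkeys(range(100 - percent_missing), 1)

def with_in(d):
    # Membership test first, then the lookup.
    dc = d.copy()
    for k in range(100):
        if k not in dc:
            dc[k] = k
        lc = dc[k]
    return dc

def with_ex(d):
    # Optimistic lookup, falling back on KeyError.
    dc = d.copy()
    for k in range(100):
        try:
            lc = dc[k]
        except KeyError:
            dc[k] = k
            lc = dc[k]
    return dc

def best_time(func, d, number=1000, repeat=3):
    # Hypothetical helper: best wall time (seconds) for `number` calls of func(d).
    return min(timeit.repeat(lambda: func(d), number=number, repeat=repeat))

if __name__ == "__main__":
    for pct in (1, 10, 50):
        d = make(pct)
        print(pct, best_time(with_in, d), best_time(with_ex, d))
```

Each function works on a copy of the dictionary, so one timed call does not fill in the missing keys for the next one.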

Of course, there is another possibility: adding a dict subclass, like...:

class dd(dict):
   def __init__(self, *a, **k):
     dict.__init__(self, *a, **k)
   def __missing__(self, k):
     addit(self, k)
     return self[k]

def with_dd():
  dc = dd(d)
  for k in range(100):
    lc = dc[k]

...

$ python -mtimeit -s'import dil; dil.make(1)' 'dil.with_dd()'
10000 loops, best of 3: 46.1 usec per loop
$ python -mtimeit -s'import dil; dil.make(10)' 'dil.with_dd()'
10000 loops, best of 3: 55 usec per loop

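...by these timings it is no faster than the get approach, and slower than the other two. (defaultdict, which is semantically analogous to this dd class, would be a performance win where applicable, because in that case the __missing__ special method is implemented in C.)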
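For reference, here is a minimal sketch of the defaultdict variant mentioned above. It only applies when the default value does not depend on the key, because the factory is called with no arguments. The function expensive_default is a hypothetical stand-in, not something from the question.

```python
from collections import defaultdict

def expensive_default():
    # Hypothetical stand-in for a computation that does not need the key.
    return 0

# Start from existing data; missing keys are filled in by the factory on lookup.
dc = defaultdict(expensive_default, {0: 1, 1: 1})

present = dc[0]   # existing key: plain lookup, the factory is not called
missing = dc[5]   # missing key: the factory is called and its result is stored
stored = 5 in dc  # the lookup above inserted the key
```

If the value has to be computed from the key, as with findCrackDepth(tonnage) in the question, a dict subclass with its own __missing__ (like dd) is still needed.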

+14

, , , . , , .

, python ( ), , , - .

+3

, .

, , .

+2

, . , , try/except, , , not in.

+1

I believe the .get() method of dict takes a parameter for a default value. You could use that and have it all on one line. I'm not sure how this affects performance, though.
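One thing to keep in mind with the one-line form: the default passed to get() is an ordinary argument, so it is evaluated before the lookup happens, and get() does not store it in the dictionary. A small sketch of this, where find_value is a hypothetical stand-in for the expensive computation:

```python
calls = []

def find_value(key):
    # Hypothetical stand-in for the expensive computation; records each call.
    calls.append(key)
    return key * 2

cache = {1: 2}

# The default expression runs even though key 1 is already present.
hit = cache.get(1, find_value(1))

# On a miss the default is returned, but it is not stored in the dict.
miss = cache.get(3, find_value(3))
```

After this runs, find_value has been called for both keys and cache still contains only key 1, so for an expensive computation the one-liner may do more work than either the in check or try/except.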

0

Source: https://habr.com/ru/post/1751597/

