if a == b or a == c: vs. if a in {b, c}:

Question

In my code I used to write comparisons of the form if a == b or a == c or a == d: quite often. At some point I found that they can easily be shortened to if a in {b, c, d}:, or to if a in (b, c, d): if the values are not hashable. However, I have never seen this construction in anyone else's code. This is probably because one of the following is true:

1. The == way is slower.
2. The == way is more pythonic.
3. They actually do subtly different things.
4. I have, by chance, not looked at any code that required either.
5. I have seen it and simply ignored or forgotten it.
6. One should not need comparisons like this, because one's code should be better elsewhere.
7. ~~No one has thought of the in way except me.~~

Which reason, if any, is it?

+4

python

Leopardshark Aug 21 '17 at 15:03

3 answers

Performance-wise, "in" is better:

>>> import timeit
>>> timeit.timeit("pub='1'; pub == 1 or pub == '1'")
0.07568907737731934
>>> timeit.timeit("pub='1'; pub in [1, '1']")
0.04272890090942383
>>> timeit.timeit("pub=1; pub == 1 or pub == '1'")
0.07502007484436035
>>> timeit.timeit("pub=1; pub in [1, '1']")
0.07035684585571289

In these measurements "in" is faster than the chained == comparisons.

+2

Surjit R Aug 21 '17 at 15:15

$ speed.py
inarray                   x 1000000:  0.277590343844
comparison                x 1000000:  0.347808290754
makearray                 x 1000000:  0.408771123295

import timeit

NUM = 1000000

a = 1
b = 2
c = 3
d = 1

array = {b,c,d}
tup = (b,c,d)
lst = [b,c,d]

def comparison():
    if a == b or a == c or a == d:
        pass

def makearray():
    if a in {b, c, d}:
        pass

def inarray():
    if a in array:
        pass

def maketuple():
    if a in (b,c,d):
        pass

def intuple():
    if a in tup:
        pass

def makelist():
    if a in [b,c,d]:
        pass

def inlist():
    if a in lst:
        pass


def time_all(funcs, params=None):
    timers = []
    for func in funcs:
        if params:
            tx = timeit.Timer(lambda: func(*params))
        else:
            tx = timeit.Timer(lambda: func())
        timers.append([func, tx.timeit(NUM)])

    # Print the results, fastest first
    for func, speed in sorted(timers, key=lambda x: x[1]):
        print("{fn:<25} x {n}: ".format(fn=func.__name__, n=NUM), speed)
    print("")

time_all([comparison,
          makearray,
          inarray,
          intuple,
          maketuple,
          inlist,
          makelist
          ], 
         )

This does not quite answer your question about why you do not often see comparisons using in. I would be speculating, but it is probably a mixture of 1, 2, 4, and whether the author happened to need this particular piece of code.

I personally used both methods depending on the situation. The choice usually came down to speed or simplicity.
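For a quick check without a full script, the statements can also be handed to timeit as strings, which avoids the extra lambda and function call around each measurement. This is only a sketch: it reuses the values from the script above, the labels and the run helper are made up for illustration, and the numbers will vary by machine and Python version.

```python
import timeit

# Same values as the script above; the statements are timed directly,
# without the extra lambda and function call around each one.
SETUP = "a = 1; b = 2; c = 3; d = 1; prebuilt = {b, c, d}"

STATEMENTS = {
    "comparison": "a == b or a == c or a == d",
    "set literal": "a in {b, c, d}",
    "prebuilt set": "a in prebuilt",
    "tuple literal": "a in (b, c, d)",
}

def run(number=200_000):
    """Return {label: seconds} for each statement, timed `number` times."""
    return {
        label: timeit.timeit(stmt, setup=SETUP, number=number)
        for label, stmt in STATEMENTS.items()
    }

# Print the results, fastest first
for label, seconds in sorted(run().items(), key=lambda item: item[1]):
    print("{:<15} {:.4f}s".format(label, seconds))
```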

edit:

@bracco23 is right: there are slight timing differences depending on whether a tuple, a set (array in the script), or a list is used.

$ speed.py
inarray                   x 1000000:  0.260784980761
intuple                   x 1000000:  0.288696420718
inlist                    x 1000000:  0.311479982167
maketuple                 x 1000000:  0.356532747578
comparison                x 1000000:  0.360010093964
makearray                 x 1000000:  0.41094386108
makelist                  x 1000000:  0.433603059099

+2

Marcel Wilson Aug 21 '17 at 15:35

Eugene Yarmash · Accepted Answer · 2017-08-21T15:09:09+0000

For simple values (i.e. not expressions or NaNs), if a == b or a == c and if a in <iterable of b and c> are equivalent.
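The caveats above can be made concrete. The following is a minimal sketch (the helper noisy is made up for illustration) of where the two forms diverge: a membership test checks identity before equality, so a NaN object is found in a container holding that same object even though it never equals itself; an or-chain short-circuits, while a container literal evaluates every element first; and a set literal needs hashable elements, which a tuple does not.

```python
nan = float("nan")

# == is always False for NaN, but `in` checks identity (is) before equality,
# so the very same NaN object is found inside the container.
print(nan == nan)          # False
print(nan in (nan,))       # True
print(nan in {nan})        # True

calls = []

def noisy(value):
    """Hypothetical helper: records each call so evaluation order is visible."""
    calls.append(value)
    return value

a = 1
# The or-chain stops at the first match ...
_ = a == noisy(1) or a == noisy(2)
print(calls)               # [1]
first_run = list(calls)

calls.clear()
# ... whereas the tuple is built in full before the membership test runs.
_ = a in (noisy(1), noisy(2))
print(calls)               # [1, 2]

# Set literals need hashable elements; tuples and lists do not.
try:
    [1] in {[1], [2]}
except TypeError as exc:
    print("TypeError:", exc)
print([1] in ([1], [2]))   # True
```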

If the values are hashable, it is better to use in with a set literal instead of a tuple or list:

if a in {b, c}: ...

The CPython peephole optimizer can often replace this with a cached frozenset() object, and set membership tests are O(1) operations.
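One way to check the optimizer claim on your own interpreter is to look at the constants stored in a compiled function. This is a sketch that assumes CPython (other implementations may behave differently), and the function names are made up for illustration. Note that the replacement only applies when every element is a constant; a literal built from names, such as {b, c}, is constructed each time the test runs.

```python
def with_constants(a):
    # All elements are constants, so CPython can store a prebuilt frozenset.
    return a in {1, 2, 3}

def with_variables(a, b, c):
    # Elements are names, so the set has to be built every time this runs.
    return a in {b, c}

def frozenset_constants(func):
    """Return the frozenset objects stored among a function's code constants."""
    return [c for c in func.__code__.co_consts if isinstance(c, frozenset)]

print(frozenset_constants(with_constants))
print(frozenset_constants(with_variables))
```

On CPython the first list is expected to contain a single frozenset and the second to be empty, which is why the prebuilt set in the benchmark above beats the set literal built from variables.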
