Without any tests, I have a few guesses. For `frozenset`s, CPython caches the hash after it has first been computed; furthermore, iterating over a set of any kind incurs extra overhead because the data is stored sparsely. In a 2-element set, that imposes a significant performance penalty on the first hash, but probably makes the second hash very fast, at least when the object itself is the same (i.e. is not a new but equivalent frozenset).
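If you want to poke at the sparse-storage claim directly, here is a minimal sketch using the standard timeit module (the numbers are illustrative and will vary by machine and CPython version):

    import timeit

    # Iterating a 2-element set vs. a 2-element tuple; the set's sparse
    # hash-table layout means more work per element during iteration.
    print(timeit.timeit('for x in s: pass', setup='s = {1, 2}'))
    print(timeit.timeit('for x in t: pass', setup='t = (1, 2)'))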
For `tuple`s, CPython does not cache the hash, but recomputes it every time. So rehashing might turn out to be slightly cheaper with frozensets. But for such a short tuple, there is probably almost no difference; it is even possible that very short tuples hash faster.
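You can observe the caching difference from pure Python with a hash-counting probe. This is just a sketch, and Probe is a made-up helper; the counts reflect CPython behavior, where a set's table entries store element hashes and a frozenset caches its own hash:

    class Probe:
        """Counts how many times it gets hashed (hypothetical helper)."""
        def __init__(self):
            self.calls = 0
        def __hash__(self):
            self.calls += 1
            return 42

    p = Probe()
    t = (p,)
    hash(t); hash(t)
    print(p.calls)   # 2: the tuple re-hashes its element on every hash(t)

    p2 = Probe()
    f = frozenset((p2,))  # hashes the element once, to place it in the table
    hash(f); hash(f)
    print(p2.calls)  # still 1: the stored entry hash is reused, and the
                     # frozenset's own hash is cached after the first call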
Lattyware's current timings line up reasonably well with my line of reasoning here; see below.
To test my intuition about the asymmetry between hashing new and old frozensets, I did the following. I believe the difference in timings is due exclusively to the extra hashing time, which is pretty small, by the way:
    >>> fs = frozenset((1, 2))
    >>> old_fs = lambda: [frozenset((1, 2)), fs][1]
    >>> new_fs = lambda: [frozenset((1, 2)), fs][0]
    >>> id(fs) == id(old_fs())
    True
    >>> id(fs) == id(new_fs())
    False
    >>> %timeit hash(old_fs())
    1000000 loops, best of 3: 642 ns per loop
    >>> %timeit hash(new_fs())
    1000000 loops, best of 3: 660 ns per loop
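If you are not in IPython, the same experiment can be run with the standard timeit module; a minimal sketch (assumes Python 3.5+ for the globals= parameter):

    import timeit

    fs = frozenset((1, 2))
    old_fs = lambda: [frozenset((1, 2)), fs][1]  # discard the fresh one, return fs
    new_fs = lambda: [frozenset((1, 2)), fs][0]  # return the fresh, equal one

    # Both lambdas build the same list, so construction cost cancels out;
    # only the object being hashed differs between the two timings.
    print(timeit.timeit('hash(old_fs())', globals=globals()))
    print(timeit.timeit('hash(new_fs())', globals=globals()))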
Note that my previous timings were wrong; using `and` created a timing asymmetry that the method above avoids. This new method gives the expected result for tuples here: a negligible timing difference:
    >>> tp = (1, 2)
    >>> old_tp = lambda: [tuple((1, 2)), tp][1]
    >>> new_tp = lambda: [tuple((1, 2)), tp][0]
    >>> id(tp) == id(old_tp())
    True
    >>> id(tp) == id(new_tp())
    False
    >>> %timeit hash(old_tp())
    1000000 loops, best of 3: 533 ns per loop
    >>> %timeit hash(new_tp())
    1000000 loops, best of 3: 532 ns per loop
And, the coup de grâce, comparing the hash time for a pre-built frozenset with the hash time for a pre-built tuple (the frozenset should win here, since it only has to return its cached hash):
    >>> %timeit hash(fs)
    10000000 loops, best of 3: 82.2 ns per loop
    >>> %timeit hash(tp)
    10000000 loops, best of 3: 93.6 ns per loop
Lattyware's results look more like this because they are an average over new and old frozensets. (They hash each tuple or frozenset twice: once when creating the dictionary, and once when accessing it.)
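To see that double hashing for yourself, here is a small sketch with a counting key (CountingKey is a hypothetical helper, not anything from the benchmark):

    class CountingKey:
        """Wraps a hashable value and counts hash() calls."""
        def __init__(self, value):
            self.value = value
            self.hash_calls = 0
        def __hash__(self):
            self.hash_calls += 1
            return hash(self.value)
        def __eq__(self, other):
            return self.value == getattr(other, 'value', other)

    k = CountingKey((1, 2))
    d = {k: 'x'}         # hash call 1: inserting the key
    _ = d[k]             # hash call 2: looking it up
    print(k.hash_calls)  # 2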
The upshot of all this is that it probably doesn't matter, except to those of us who enjoy digging around in Python's internals and testing things into oblivion.