Why is for _ in range(n) slower than for _ in [""] * n?

Question

While testing alternatives to for _ in range(n) (to perform some action n times, even if the action does not depend on the value of n), I noticed that there is another formulation of this pattern that is faster: for _ in [""] * n.

For instance:

 timeit('for _ in range(10^1000): pass', number=1000000)

returns 16.4 seconds;

then,

 timeit('for _ in [""]*(10^1000): pass', number=1000000)

takes 10.7 seconds.

Why is [""] * 10^1000 so much faster than range(10^1000) in Python 3?

All tests performed using Python 3.3

+6

python python-internals

Dunedubby May 22, '15 at 14:57


2 answers

Your problem is that you are feeding timeit incorrectly.

You need to give timeit strings containing Python statements. If you do

 stmt = 'for _ in ['']*100: pass'

and look at the value of stmt, you will see that the quote characters inside the square brackets match the string delimiters, so Python interprets them as string delimiters. Since Python concatenates adjacent string literals, what you really have is the same as 'for _ in [' + ']*100: pass', which gives you 'for _ in []*100: pass'.

So your "super-fast" loop just iterates over an empty list, not a list of 100 items. Try your test with, for example:

 stmt = 'for _ in [""]*100: pass'
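A quick way to check what timeit actually receives is to print the statement string before timing it. The following is a minimal sketch; the variable names are only illustrative and not part of the original question.

```python
# Adjacent string literals are joined by the parser, so the inner ''
# closes and reopens the outer single-quoted string.
broken = 'for _ in ['']*100: pass'

# Two ways to keep the inner quotes: mix quote styles, or escape them.
mixed = 'for _ in [""]*100: pass'
escaped = 'for _ in [\'\']*100: pass'

print(broken)   # for _ in []*100: pass
print(mixed)    # for _ in [""]*100: pass
print(escaped)  # for _ in ['']*100: pass

# The broken statement loops over an empty list.
print(len([] * 100), len([""] * 100))  # 0 100
```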

+10

John y May 22, '15 at 16:14


Martijn pieters · Accepted Answer · 2015-05-22T16:30:30+0000

When iterating over range(), objects for all integers from 0 up to (but not including) n are produced; this takes a (small) amount of time, even though small integers are cached.

The loop over [None] * n, on the other hand, produces n references to one object, and creating that list is a little faster.

However, the range() object uses far less memory and is more readable to boot, which is why people prefer using it. Most code doesn't have to squeeze every last drop of performance.

If you need that speed, you can use a custom iterable that takes no memory, using itertools.repeat() with a second argument:
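The memory difference can be observed with sys.getsizeof. This is a rough sketch for CPython; the exact byte counts vary by version and platform, and the names lazy and eager are made up for the example.

```python
import sys

n = 10 ** 6

lazy = range(n)       # computes its values on demand
eager = [None] * n    # n references to the single None object

print(sys.getsizeof(lazy))   # small, and does not grow with n
print(sys.getsizeof(eager))  # grows with n (one pointer per slot)

# Every slot in the list refers to the same object.
print(all(item is None for item in eager[:5]))  # True
```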

 from itertools import repeat

 for _ in repeat(None, n):
     pass  # the action to perform n times goes here
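As an illustration of how this pattern might be wrapped up, here is a small sketch. The helper run_n_times is hypothetical and only shows the idea of repeating an action without using the loop variable.

```python
from itertools import repeat


def run_n_times(action, n):
    """Call action() n times without building a list of n items."""
    # repeat(None, n) yields the same None object n times.
    for _ in repeat(None, n):
        action()


calls = []
run_n_times(lambda: calls.append("tick"), 3)
print(len(calls))  # 3
```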

As for your timing tests, there are some problems with those.

First of all, you made an error in your ['']*n timing loop; you did not embed two quotes, you concatenated two strings and produced an empty list:

 >>> '['']*n'
 '[]*n'
 >>> []*100
 []

That is going to be unbeatable in an iteration, since you iterated 0 times.

You also did not use large numbers; ^ is the bitwise XOR operator, not the power operator (that would be **):

 >>> 10^1000
 994

which means your test missed how long it took to create a large list of empty values.
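To make the difference between the two operators concrete, the short sketch below compares them and prints the bit patterns involved. The variable names are chosen only for this example.

```python
xor_result = 10 ^ 1000    # bitwise XOR of the two integers
power_result = 10 ** 3    # exponentiation

print(xor_result)    # 994
print(power_result)  # 1000

# XOR flips the bits of 1000 wherever 10 has a 1 bit.
print(bin(1000))        # 0b1111101000
print(bin(10))          # 0b1010
print(bin(xor_result))  # 0b1111100010
```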

Using better numbers and None gives you:

 >>> from timeit import timeit
 >>> 10 ** 6
 1000000
 >>> timeit("for _ in range(10 ** 6): pass", number=100)
 3.0651066239806823
 >>> timeit("for _ in [None] * (10 ** 6): pass", number=100)
 1.9346517859958112
 >>> timeit("for _ in repeat(None, 10 ** 6): pass", 'from itertools import repeat', number=100)
 1.4315521717071533
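If you want to repeat the comparison on your own machine, a script along these lines could serve as a starting point. It is only a sketch: the globals argument of timeit requires Python 3.5 or later (the question used 3.3), the values of n and runs are arbitrary choices to keep the run short, and absolute timings will differ between machines and Python versions.

```python
from timeit import timeit

n = 10 ** 5   # smaller than in the session above, to keep the run short
runs = 20     # arbitrary repetition count

# name -> (statement to time, setup code)
candidates = {
    "range": ("for _ in range(n): pass", "pass"),
    "list": ("for _ in [None] * n: pass", "pass"),
    "repeat": ("for _ in repeat(None, n): pass",
               "from itertools import repeat"),
}

results = {}
for name, (stmt, setup) in candidates.items():
    # A fresh namespace per candidate so setup imports do not leak.
    results[name] = timeit(stmt, setup=setup, number=runs, globals={"n": n})

# Print the candidates from fastest to slowest.
for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name:>6}: {seconds:.4f} s")
```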
