Why is this algorithm slower?

This is one of the algorithms for generating prime numbers given on Wikipedia:

```python
def eratosthenes_sieve(n):
    # Create a candidate list within which non-primes will be
    # marked as None; only candidates below sqrt(n) need be checked.
    candidates = [i for i in range(n + 1)]
    fin = int(n ** 0.5)
    # Loop over the candidates, marking out each multiple.
    for i in range(2, fin + 1):
        if not candidates[i]:
            continue
        candidates[i + i::i] = [None] * (n // i - 1)
    # Filter out non-primes and return the list.
    return [i for i in candidates[2:] if i]
```

I changed the algorithm a bit:

```python
def eratosthenes_sieve(n):
    # Create a candidate list within which non-primes will be
    # marked as None; only candidates below sqrt(n) need be checked.
    candidates = [i for i in range(n + 1)]
    fin = int(n ** 0.5)
    # Loop over the candidates, marking out each multiple.
    candidates[4::2] = [None] * (n // 2 - 1)
    for i in range(3, fin + 1, 2):
        if not candidates[i]:
            continue
        candidates[i + i::i] = [None] * (n // i - 1)
    # Filter out non-primes and return the list.
    return [i for i in candidates[2:] if i]
```

First I marked off all the multiples of 2, and then only looked at the odd numbers. When I timed both algorithms (I tried 40,000,000), the first one was consistently faster (although only slightly). I do not understand why. Can someone explain?

PS: When I try 100,000,000, my computer freezes. Why does this happen? I have a Core Duo E8500, 4 GB of RAM, Windows 7 Pro 64-bit.
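A rough back-of-the-envelope estimate (my own sketch, assuming 64-bit CPython, where each list slot holds an 8-byte pointer and each distinct `int` is a heap object of roughly 28 bytes) shows why a candidate list of 100,000,000 integers does not fit comfortably in 4 GB:

```python
import sys

n = 100_000_000
pointer_bytes = n * 8                        # one pointer per list slot
int_obj_bytes = n * sys.getsizeof(1234567)   # each non-cached int is its own object
total_gib = (pointer_bytes + int_obj_bytes) / 1024 ** 3
print(f"candidate list alone: ~{total_gib:.1f} GiB")
```

That is close to all the physical RAM in the machine, so the operating system starts swapping and the computer appears to freeze.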

Update 1: this is Python 3.

Update 2: this is how I timed it:

```python
import time

start = time.time()
a = eratosthenes_sieve(40000000)
end = time.time()
print(end - start)
```

UPDATE: with the help of the valuable comments (especially from nightcracker and Winston Ewert), I was able to write what I intended in the first place:

```python
def eratosthenes_sieve(n):
    # Create a candidate list within which non-primes will be
    # marked as None; only candidates below sqrt(n) need be checked.
    c = [i for i in range(3, n + 1, 2)]
    fin = int(n ** 0.5) // 2
    # Loop over the candidates, marking out each multiple.
    for i in range(fin):
        if not c[i]:
            continue
        c[c[i] + i::c[i]] = [None] * ((n // c[i]) - (n // (2 * c[i])) - 1)
    # Filter out non-primes and return the list.
    return [2] + [i for i in c if i]
```

This algorithm improves on the original algorithm (quoted above) by 50%. (It is still slower than the algorithm posted by nightcracker, of course.)
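As a quick sanity check (my own addition), the odds-only version from the update above can be cross-checked against naive trial division for a small `n`:

```python
def eratosthenes_sieve(n):
    # Odds-only sieve: c holds 3, 5, 7, ..., n; index i corresponds to 2*i + 3.
    c = [i for i in range(3, n + 1, 2)]
    fin = int(n ** 0.5) // 2
    for i in range(fin):
        if not c[i]:
            continue
        # Mark the odd multiples 3*p, 5*p, ... of p = c[i]; there are
        # (n // p) - (n // (2 * p)) - 1 of them in the list.
        c[c[i] + i::c[i]] = [None] * ((n // c[i]) - (n // (2 * c[i])) - 1)
    return [2] + [i for i in c if i]

def is_prime(k):
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))

assert eratosthenes_sieve(1000) == [p for p in range(2, 1001) if is_prime(p)]
print(eratosthenes_sieve(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The slice start `c[i] + i` is the index of `3 * c[i]`, and stepping by `c[i]` in the half-size list advances by `2 * c[i]` in value, which is why only odd multiples get touched.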

Question for the Python masters: is there a more Pythonic way to express this last code, in a more functional style?

UPDATE 2: I still cannot decode the algorithm posted by nightcracker. I guess I'm too stupid.

+4
4 answers

The question is rather: why would it even be faster? In both examples you are filtering out the multiples of two, the hard way. It does not matter whether you write `candidates[4::2] = [None] * (n // 2 - 1)` explicitly or let it be executed in the first iteration of the `for i in range(2, fin + 1):` loop.

If you are interested in an optimized sieve of Eratosthenes, here you go:

```python
def primesbelow(N):
    # /questions/14495/fastest-way-to-list-all-primes-below-n/105138#105138
    # """ Input N >= 6, returns a list of primes, 2 <= p < N """
    correction = N % 6 > 1
    N = (N, N - 1, N + 4, N + 3, N + 2, N + 1)[N % 6]
    sieve = [True] * (N // 3)
    sieve[0] = False
    for i in range(int(N ** .5) // 3 + 1):
        if sieve[i]:
            k = (3 * i + 1) | 1
            sieve[k * k // 3::2 * k] = [False] * ((N // 6 - (k * k) // 6 - 1) // k + 1)
            sieve[(k * k + 4 * k - 2 * k * (i % 2)) // 3::2 * k] = \
                [False] * ((N // 6 - (k * k + 4 * k - 2 * k * (i % 2)) // 6 - 1) // k + 1)
    return [2, 3] + [(3 * i + 1) | 1 for i in range(1, N // 3 - correction) if sieve[i]]
```

An explanation is here: Porting an optimized sieve of Eratosthenes from Python to C++

The source is here, but there was no explanation there. In short, this prime sieve skips the multiples of 2 and 3 and uses several hacks to exploit Python's fast slice assignment.
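One of those hacks can be checked in isolation (my own sketch): the index-to-candidate mapping `(3 * i + 1) | 1` enumerates exactly the numbers coprime to both 2 and 3, which is why the sieve only needs `N // 3` booleans instead of `N`:

```python
# For i = 1, 2, 3, ... the expression (3*i + 1) | 1 yields 5, 7, 11, 13, ... —
# the numbers congruent to 1 or 5 mod 6, i.e. everything not divisible by 2 or 3.
wheel = [(3 * i + 1) | 1 for i in range(1, 12)]
print(wheel)  # [5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35]
assert all(x % 2 and x % 3 for x in wheel)
assert all(x % 6 in (1, 5) for x in wheel)
```

Note that composites like 25 and 35 still appear in the wheel; the `sieve` bit array is what filters those out.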

+2

You do not save much time by avoiding the even numbers. Most of the computation time in the algorithm is spent on this line:

```python
candidates[i + i::i] = [None] * (n // i - 1)
```

This line is what causes most of the work on the computer's part. Whenever the number in question is even, this line is not executed, because the `continue` in the `if` statement skips it. The time spent on the loop iterations for even numbers is therefore already very small, and eliminating those iterations does not produce a significant change in the loop's running time. That is why your method is not noticeably faster.
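To put numbers on this claim (a small instrumented run of the original sieve, my own sketch): for n = 1,000,000 the loop header runs fewer than a thousand times, and the skipped composites each cost only one truthiness test, while the slice assignments write millions of cells:

```python
n = 1_000_000
candidates = list(range(n + 1))
fin = int(n ** 0.5)
cells_written = skipped = 0
for i in range(2, fin + 1):
    if not candidates[i]:
        skipped += 1          # composites (including all evens > 2) end here
        continue
    cells_written += n // i - 1
    candidates[i + i::i] = [None] * (n // i - 1)
print(f"loop iterations: {fin - 1}, skipped: {skipped}, "
      f"cells written: {cells_written}")
```

The cell writes outnumber the loop iterations by more than three orders of magnitude, so shaving even iterations off the loop cannot move the total time much.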

When Python produces numbers for a `range`, it uses the formula `start + index * step`. Multiplying by a step of two, as in your case, is slightly more expensive than the step of one in the original.

There is also, possibly, a little overhead from having a longer function.

None of these are really significant speed issues, but they outweigh the very small benefit your version brings.

+2

It is probably a little slower because you are doing extra setup for something that would have been done in the first case anyway (marking off the numbers divisible by two). That setup time may be what you are seeing, if the difference is as small as you say.

0

Your extra step is not needed: the "get rid of evens" operation actually traverses the entire collection of n elements up front, rather than only doing work up to n^(1/2).

0

Source: https://habr.com/ru/post/1345164/

