Generator expression vs. list comprehension performance in Python

I recently learned about generators and list comprehensions, and while experimenting with the profiler to look for performance gains, I ran cProfile on a sum of even numbers over a large range, using both.

In the generator version, the <string>:1(<genexpr>) line has a shorter cumulative time than its list counterpart, but its call count is what puzzles me. It makes 5000001 calls, which I assume are the check on each number; but if that is simple, shouldn't the list comprehension show something similar instead of just a single <string>:1(<module>) line?

Am I missing something in my profile?

In [8]: cProfile.run('sum((number for number in xrange(9999999) if number % 2 == 0))')
         5000004 function calls in 1.111 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  5000001    0.760    0.000    0.760    0.000 <string>:1(<genexpr>)
        1    0.000    0.000    1.111    1.111 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.351    0.351    1.111    1.111 {sum}

In [9]: cProfile.run('sum([number for number in xrange(9999999) if number % 2 == 0])')
         3 function calls in 1.123 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.075    1.075    1.123    1.123 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.048    0.048    0.048    0.048 {sum}
1 answer

First of all, the calls refer to the next method (__next__ in Python 3) of the generator object, not to the even-number check.
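To see those calls without a profiler, here is a minimal sketch in Python 3. CountingIterator is a hypothetical helper written for this illustration (it is not part of the standard library); it wraps the generator expression and counts every __next__ call that sum makes, including the last one that ends in StopIteration.

```python
class CountingIterator:
    """Hypothetical helper: wrap an iterable and count __next__ calls."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.calls = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Counted even when the underlying iterator raises StopIteration.
        self.calls += 1
        return next(self._it)


# Ten even numbers in range(20): 0, 2, ..., 18.
evens = CountingIterator(n for n in range(20) if n % 2 == 0)
total = sum(evens)

print(total)        # 90
print(evens.calls)  # 11: ten values plus one final call that stops the iteration
```

The count is one more than the number of values produced, which matches the 5000000 + 1 pattern in the profile.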

In Python 2, you will not get an extra line for the list comprehension (LC), because the LC does not create a separate code object. In Python 3 you will, because an additional code object (<listcomp>) is now created for the LC as well, to make it behave like a generator expression.
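One way to check which code objects are created is to compile each expression and look at the nested code objects among its constants. This is only a sketch: nested_code_names is a hypothetical helper, and the result for the list comprehension depends on the interpreter version. On CPython 3.12 and later, comprehensions are inlined (PEP 709), so no separate <listcomp> code object appears there.

```python
import types


def nested_code_names(source):
    """Hypothetical helper: names of code objects nested directly in an expression."""
    code = compile(source, "<string>", "eval")
    return [c.co_name for c in code.co_consts if isinstance(c, types.CodeType)]


genexpr_names = nested_code_names("sum(n for n in range(10) if n % 2 == 0)")
listcomp_names = nested_code_names("sum([n for n in range(10) if n % 2 == 0])")

print(genexpr_names)   # contains '<genexpr>'
print(listcomp_names)  # ['<listcomp>'] on older Python 3, [] on 3.12+
```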

>>> cProfile.run('sum([number for number in range(9999999) if number % 2 == 0])')
         5 function calls in 1.751 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.601    1.601    1.601    1.601 <string>:1(<listcomp>)
        1    0.068    0.068    1.751    1.751 <string>:1(<module>)
        1    0.000    0.000    1.751    1.751 {built-in method exec}
        1    0.082    0.082    0.082    0.082 {built-in method sum}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

>>> cProfile.run('sum((number for number in range(9999999) if number % 2 == 0))')
         5000005 function calls in 2.388 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  5000001    1.873    0.000    1.873    0.000 <string>:1(<genexpr>)
        1    0.000    0.000    2.388    2.388 <string>:1(<module>)
        1    0.000    0.000    2.388    2.388 {built-in method exec}
        1    0.515    0.515    2.388    2.388 {built-in method sum}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

The number of calls differs: 1 for the LC compared to 5000001 for the generator expression. This is mostly because sum consumes an iterator, so it must call the generator's __next__ method 5000000 + 1 times (the last call is probably the one that raises StopIteration to end the iteration). For the list comprehension, all the magic happens inside its code object, where LIST_APPEND adds items to the list one by one, i.e. there are no calls visible to cProfile.
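The LIST_APPEND instruction can be confirmed with the dis module. The sketch below uses a hypothetical helper, all_opnames, that collects opcode names from a compiled expression and from any code objects nested inside it, so it works whether or not the interpreter keeps the comprehension in a separate code object.

```python
import dis
import types


def all_opnames(code):
    """Hypothetical helper: opcode names in a code object and its nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= all_opnames(const)
    return names


listcomp_ops = all_opnames(compile("[n for n in range(10) if n % 2 == 0]", "<string>", "eval"))
genexpr_ops = all_opnames(compile("(n for n in range(10) if n % 2 == 0)", "<string>", "eval"))

print("LIST_APPEND" in listcomp_ops)  # True
print("LIST_APPEND" in genexpr_ops)   # False
```

The list is built by bytecode inside a single frame, so the profiler has no per-item function call to record.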


Source: https://habr.com/ru/post/986784/

